1.  FOREWORD


INFORMATION  THEORY  ANALYSIS  FOR  DATA  FUSION 
Contract  DAAH04-96-R-BAA1 
November  30,  1997 

Final  Report  submitted  to  U.S.  Army  Research  Office,  Electronics  Division,  by 

Ronald  P.S.  Mahler,  Ph.D.,  for 

Lockheed  Martin  Tactical  Defense  Systems,  3333  Pilot  Knob  Rd.,  Eagan  MN  55121 

This  final  report  is  offered  by  Lockheed  Martin  Tactical  Defense  Systems,  Eagan  MN,  in 
compliance  with  the  reporting  requirements  for  research  conducted  during  the  last  three  years 
under  contract  DAAH04-94-C-0011.  Under  this  contract  Lockheed  Martin  has  developed  a 
unified,  information  theory-based  approach  to  information  fusion.  The  proposed  theoretical  tool 
is  "finite  set  statistics, "  a  special  case  of  random  set  theory  specifically  developed  as  a  part  of 
the  project.  Finite-set  statistics  results  in  a  systematic,  fully  probabilistic,  theoretical  unification 
of:  detection,  classification,  tracking,  decision-making,  sensor  management,  expert  systems 
theory,  and  performance  evaluation.  Highlights  of  this  work  are: 

(1)  Algorithms  for  simultaneous  optimal  estimation  of  numbers,  identities,  geokinematics  of 
targets,  along  with  control  of  sensor  dwells  and  modes; 

(2)  Rigorous  basis  for  fusion  of  "ambiguous"  (imprecise,  vague,  contingent)  observations; 

(3)  Systematic,  theoretically  justifiable  performance  evaluation  using  information  theory; 

(4)  Optimal  sensor  management  via  nonlinear  control  theory  and  information  theory. 

In addition, preliminary work was begun towards developing a multisource, multitarget decision
theory potentially applicable to Level 2 fusion (situation assessment) and Level 3 fusion (threat
assessment). Seventeen conference papers resulted from this work. The main result of the
project, however, was a book: Mathematics of Data Fusion (Kluwer Academic Publishers),
co-authored with I.R. Goodman of Naval Research and Development and Prof. H.T. Nguyen of
New Mexico State University (Las Cruces). Chapter 2 and Chapters 4 through 8 of this book
were  devoted  to  work  completed  under  this  contract.  In  a  closely  related  activity,  the  P.I.  for 
this  project  was  principal  organizer,  co-chair,  and  co-editor  for  a  Workshop  on  Applications  and 
Theory  of  Random  Sets,  held  at  the  Institute  for  Mathematics  and  Its  Applications  (Minneapolis), 
jointly  sponsored  by  Office  of  Naval  Research,  the  Electronics  Division  of  U.S.  Army  Research 
Office,  and  Lockheed  Martin  Tactical  Defense  Systems.  The  proceedings  of  this  workshop 
appeared in hardcover (Springer-Verlag) in November 1997.
4.  FINAL  REPORT


Information  Theory  Analysis  for  Data  Fusion  (ITADF) 

Contract  DAAH04-96-R-BAA1 
November  30,  1997 

Final  Report  submitted  to  U.S.  Army  Research  Office,  Electronics  Division,  by 

Ronald  P.S.  Mahler,  Ph.D.,  for 

Lockheed  Martin  Tactical  Defense  Systems,  3333  Pilot  Knob  Rd.,  Eagan  MN  55121 

This  final  report  is  offered  by  Lockheed  Martin  Tactical  Defense  Systems,  Eagan  MN,  in 
compliance  with  reporting  requirements  for  work  completed  under  the  three-year  contract 
DAAH04-94-C-0011.  This  contract  expired  on  Nov.  30,  1997. 

4-A.  STATEMENT  OF  THE  PROBLEM  STUDIED 

4-A-a.  Summary  of  the  Problem  Studied.  Under  contract  DAAH04-94-C-0011  Lockheed 
Martin  developed  a  unified,  information  theory-based  approach  to  information  fusion.  The 
proposed  theoretical  tool  is  "finite  set  statistics"  (a  special  case  of  random  set  theory  specifically 
developed  as  a  part  of  the  project).  Finite-set  statistics  is  a  unified  statistical  calculus  which  (1) 
allows  multisensor,  multitarget  information  fusion  problems  to  be  treated  mathematically  in  the 
same  way  as  single-sensor,  single-target  problems;  and  (2)  provides  a  probabilistic  framework 
for  integrating  expert  systems  approaches  (e.g.  fuzzy  logic,  imprecise  evidence,  rule-based 
inference).  The  result  is  a  systematic  theoretical  unification  of:  detection,  classification, 
tracking,  decision-making,  sensor  allocation,  expert  systems  theory,  and  performance  evaluation. 
Highlights  of  this  work  are: 

(1)  Algorithms  for  optimal  simultaneous  estimation  of  numbers,  identities,  geokinematics  of 
targets,  along  with  the  optimal  control  of  sensor  dwells  and  modes; 

(2)  Rigorous  basis  for  fusion  of  "ambiguous"  (imprecise,  vague,  contingent)  observations; 

(3)  Systematic,  theoretically  justifiable  performance  evaluation  using  information  theory; 

(4)  Optimal  sensor  management  via  nonlinear  control  theory  and  information  theory. 

In addition, under this contract preliminary work was begun towards developing a multisource,
multitarget decision theory potentially applicable to Level 2 fusion (situation assessment) and
Level 3 fusion (threat assessment).

Items (1), (2) and (3) above have been described in detail in Chapters 2, 4, 6, 7, and 8 of Mathematics
of Data Fusion [12], a book published by Kluwer Academic Publishers in September 1997
(co-authors: I.R. Goodman, NRaD Code 4221, and Prof. H.T. Nguyen, New Mexico State
University-Las Cruces). Seventeen conference papers were also produced as a result of the
work completed under this contract. In a closely related activity, the P.I. for this project was
principal organizer, co-chair, and co-editor for a Workshop on Applications and Theory of
Random Sets, held at the Institute for Mathematics and Its Applications (Minneapolis), jointly
sponsored by ONR, ARO, and Lockheed Martin. The proceedings of this workshop appeared
in hardcover (Springer-Verlag) in November 1997 [14].

This  work  has  attracted  considerable  favorable  attention  and  interest  in  both  the  DoD  information 
fusion  and  academic  communities.  The  P.I.  was  invited  to  speak  on  the  subject  at  several  DoD 
and  university  workshops  and  seminars,  including  the  Air  Force  Institute  of  Technology,  Naval 
Research  and  Development,  the  USAF  Tracking  and  Correlation  Symposium,  Harvard 
University  Department  of  Applied  Sciences,  the  Johns  Hopkins  Department  of  Electronic  and 
Computer Engineering, the University of Massachusetts (Amherst) Department of Electrical
Engineering,  and  the  New  Mexico  State  University  (Las  Cruces)  Department  of  Mathematical 
Sciences.  He  has  also  been  invited  to  speak  on  the  same  subject  at  several  engineering 
conferences,  including  the  1995  IEEE  Conference  on  Decision  and  Control  and  the  1994 
National  Symposium  on  Sensor  Fusion  (formal  invitations)  and  the  1995  and  1998  SPIE 
AeroSense  Conferences  (informal  invitations). 

4-A-b.  The  Need  for  This  Work:  Technical  and  Scientific.  The  approach  investigated  in  this 
work  is  partially  motivated  by  the  fact  that  many  kinds  of  data  are  ambiguous  in  the  sense  that 
they  are  poorly  characterized  from  a  statistical  point  of  view.  Such  data  includes:  ambiguous 
features  or  attributes;  natural-language  statements;  and  rules.  The  lack  of  a  solid  probabilistic 
foundation  for  such  data  has  led  to  the  use  of  a  number  of  heuristic  approaches  such  as  fuzzy 
logic,  the  Dempster-Shafer  theory  of  evidence,  rule-based  inference,  etc. 

The  approach  is  also  motivated  by  the  fact  that  there  is  no  "level  playing  field"  for  determining 
the  performance  of  data  fusion  algorithms,  for  comparing  one  algorithm  to  another,  and  so  on. 

Last  but  not  least,  this  work  is  motivated  by  the  fact  that  Bayes-optimal  multitarget  estimation 
and  filtering  encounters  fundamental  conceptual  difficulties  when  the  number  of  targets  is 
unknown.  When  one  tries  to  apply  the  standard  statistical  thinking  just  described  to  the 
multitarget  case  with  unknown  number  of  targets,  one  quickly  discovers  that  taking  things  for 
granted  leads  to  serious  troubles  that  directly  bear  on  practice.  First,  how  do  we  uniquely 
specify  all  of  the  states  that  the  multitarget  system  can  occupy?  Multitarget  states  must  look 
something  like  this: 

$\emptyset$ : no-target state
$x_1$ : the single-target states
$x_1, x_2$ : the two-target states
$x_1, \ldots, x_t$ : the t-target states

However, this specification of states is incomplete. The symbol $x, x$ signifies not that a single
target with state $x$ is present twice, but rather that two completely different targets happen to
occupy the same kinematic state $x$. Strictly speaking, therefore, the state of an individual target
is not fully specified (in a multitarget context) unless a unique identifier (an "I.D. tag") has been
attached to it, e.g. $(x, \tau)$, where $x$ is the target's kinematic state and $\tau$ is its unique
identifying tag. Thus the incompletely specified two-target state $x, x$ should be replaced by the
completely specified two-target state $(x, \tau_1), (x, \tau_2)$. Two-target states of the form $(x, \tau), (y, \tau)$
with $x \neq y$ must be excluded as non-physical since the same target cannot occupy two
different kinematic states simultaneously (see [12], pp. 194-198). Also, note that the two-target
state $(x_1, \tau_1), (x_2, \tau_2)$ and the two-target state $(x_2, \tau_2), (x_1, \tau_1)$ are not distinct: they represent
the same two-target state.
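
The tagging argument above can be illustrated with a minimal Python sketch. It is only an illustration, not the construction used in [12]: the names `TaggedState` and `make_multitarget_state`, the string tags, and the scalar kinematic states are all assumptions made here. A multitarget state is modeled as an unordered set of (kinematic state, tag) pairs, so that listing order is irrelevant and a tag that appears with two different kinematic states is rejected as non-physical.

```python
from typing import FrozenSet, Iterable, Tuple

# A tagged single-target state: (kinematic state x, unique tag tau).
# Scalar x and string tags are simplifying assumptions of this sketch.
TaggedState = Tuple[float, str]


def make_multitarget_state(targets: Iterable[TaggedState]) -> FrozenSet[TaggedState]:
    """Build an unordered multitarget state, rejecting non-physical inputs."""
    state = frozenset(targets)
    tags = [tau for _, tau in state]
    # The same tag paired with two different kinematic states is non-physical.
    if len(tags) != len(set(tags)):
        raise ValueError("a target tag appears with two different kinematic states")
    return state


# Two different targets at the same kinematic state x = 3.0 remain distinct.
same_place = make_multitarget_state([(3.0, "tau1"), (3.0, "tau2")])

# Listing order does not matter: both describe one two-target state.
a = make_multitarget_state([(1.0, "tau1"), (2.0, "tau2")])
b = make_multitarget_state([(2.0, "tau2"), (1.0, "tau1")])
```

Using an immutable set rather than a tuple is the design choice that makes the two orderings compare equal without any extra bookkeeping.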

Putting  such  subtleties  aside  for  the  moment,  let  us  proceed  immediately  to  a  declaration  of 
victory  and  assume  that  we  can  define  a  multitarget  posterior  distribution  function  on  the  (thus 
far  vaguely  specified)  multitarget  state  space.  To  keep  things  simple,  assume  that  all  targets  are 
motionless,  exist  in  one  dimension,  and  are  completely  specified  by  their  locations  on  the  real 
line  as  measured  in  meters.  Also  assume  that  a  single  sensor  collects  a  set  Z  of  measurements 
from  the  targets,  whose  number  as  well  as  positions  are  unknown  and  are  to  be  estimated.  One 
can write a naive posterior distribution function $f(\text{state} \mid Z)$ on the multitarget state space, given
measurements  Z,  as  follows: 

$f(\emptyset \mid Z)$ = posterior likelihood of zero targets
$f(x_1 \mid Z)$ = posterior likelihood of one target in state $x_1$
$f(x_1, x_2 \mid Z)$ = posterior likelihood of two targets in states $x_1, x_2$
$f(x_1, \ldots, x_t \mid Z)$ = posterior likelihood of t targets in states $x_1, \ldots, x_t$

Since the cumulative likelihood summed over all multitarget states must be 1, it follows that

$$f(0 \mid Z) + f(1 \mid Z) + \cdots + f(t \mid Z) + \cdots + f(M \mid Z) = 1$$

where $f(0 \mid Z) = f(\emptyset \mid Z)$, where $f(t \mid Z) = \int f(x_1, \ldots, x_t \mid Z)\, dx_1 \cdots dx_t$ for $t = 1, \ldots, M$,
and where M is the maximum expected number of targets in the scenario. Now, let us naively
use  the  MAP  procedure  to  estimate  the  complete  state  of  the  multitarget  system.  Then  we  would 
write: 


$$\hat{x}_1, \ldots, \hat{x}_{\hat{t}} = \arg\sup_{t,\, x_1, \ldots, x_t} f(x_1, \ldots, x_t \mid Z)$$


where $\hat{t}$ is the estimated number of targets and the $\hat{x}_i$ are the estimated positions of the $\hat{t}$
targets. Unfortunately, there is a problem: from the definition of a Riemann integral we know
that:


•  $f(0 \mid Z)$ = unitless probability

•  $f(1 \mid Z)$ = unitless probability, so units of $f(x_1 \mid Z)$ must be 1/meter since the units of $dx_1$ are in meters

•  $f(2 \mid Z)$ = unitless probability, so units of $f(x_1, x_2 \mid Z)$ must be 1/meter$^2$

•  $f(t \mid Z)$ = unitless probability, so units of $f(x_1, \ldots, x_t \mid Z)$ must be 1/meter$^t$
Consequently the "arg sup" operation is not defined since the quantities $f(\emptyset \mid Z)$, $f(x_1 \mid Z)$, ...,
$f(x_1, \ldots, x_t \mid Z)$ are incommensurable with respect to units of measurement. Thus the MAP
estimator cannot even be defined! One might try to sidestep this problem by using
Riemann-Stieltjes integrals. That is, let $g(x)$ be an arbitrary density function on (single-target) state
space and let $G(x) = \int_{-\infty}^{x} g(y)\, dy$ be its corresponding cumulative probability function. Then
one could instead define multitarget densities using Riemann-Stieltjes integrals

$$h(t \mid Z) = \int h(x_1, \ldots, x_t \mid Z)\, dG(x_1) \cdots dG(x_t) = \int h(x_1, \ldots, x_t \mid Z)\, g(x_1) \cdots g(x_t)\, dx_1 \cdots dx_t$$

where, now, the multitarget distributions

$$h(x_1, \ldots, x_t \mid Z) = \frac{f(x_1, \ldots, x_t \mid Z)}{g(x_1) \cdots g(x_t)}$$

are unitless. Then, the multitarget distribution $h$ will not have the incommensurability-of-units
problem just noted and so could be used to define a multitarget MAP estimate. The price,
however, is the introduction of an arbitrary "fudge factor" (the density $g$) into the concept of
a multitarget posterior distribution $h$.
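
The incommensurability problem can be made concrete with a small numerical sketch. The numbers below are hypothetical and are not taken from this report: the posterior probability of "no target" is assumed to be 0.4, and the remaining 0.6 is assumed to be spread as a Gaussian over target position with a standard deviation of 10 meters. The function name `naive_map_decision` is likewise an assumption. The naive comparison of $f(\emptyset \mid Z)$ with the peak of $f(x_1 \mid Z)$ then gives different answers depending only on whether position is expressed in meters or kilometers.

```python
import math


def naive_map_decision(p_no_target: float, sigma: float) -> str:
    """Compare f(empty | Z) with the peak of f(x | Z), ignoring units."""
    p_one_target = 1.0 - p_no_target
    # Peak of a Gaussian density carrying total mass p_one_target.
    # Its units are 1 / (units of sigma), so it is not a probability.
    peak_density = p_one_target / (math.sqrt(2.0 * math.pi) * sigma)
    return "one target" if peak_density > p_no_target else "no target"


# Same physical scenario, two choices of length unit.
decision_in_meters = naive_map_decision(0.4, sigma=10.0)      # sigma = 10 m
decision_in_kilometers = naive_map_decision(0.4, sigma=0.01)  # sigma = 0.01 km
```

Under these assumptions the decision flips from "no target" (meters) to "one target" (kilometers), even though nothing about the scenario has changed.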


If  we  instead  turn  to  the  posterior  expectation  for  our  salvation,  our  troubles  get  even  worse. 
A multitarget posterior expectation, if it exists, must have the general form
$\int (x_1, \ldots, x_t)\, f((x_1, \ldots, x_t) \mid Z)\, d(x_1, \ldots, x_t)$, where the (as yet to be defined) integral is taken over all (thus far
vaguely defined) multitarget states $(x_1, \ldots, x_t)$. Such an integral cannot even be defined unless,
at minimum, the multitarget state space is a vector space; in particular, unless it has a concept
of addition/subtraction. But how does one add the zero-target state $\emptyset$ to a single-target state
$x$? Or a single-target state $x$ to a two-target state $x_1, x_2$? We could attempt to address this
problem  by  embedding  the  multitarget  state  space  in  a  larger,  enveloping  space  which  is  a  vector 
space.  In  this  case,  however,  there  is  no  guarantee  that  the  posterior  expectation  would  yield 
values  which  are  actual  multitarget  states.  Rather,  it  is  more  likely  that  it  would  yield  values 
which  are  strictly  in  the  enveloping  vector  space  and  therefore  which  would  have  no  physical 
meaning.  Having  been  denied  the  accustomed  security  of  the  classical  estimators,  we  are 
therefore  forced  to  propose  new  ones  and  show  that  they  are  statistically  well-behaved. 


Once such difficulties are exposed, still others are driven into the sunlight. For example, the
multitarget posterior densities $f(x_1, \ldots, x_t \mid Z)$ cannot be defined unless one has at hand a
multitarget measurement model $f(z_1, \ldots, z_k \mid x_1, \ldots, x_t)$ which tells us the likelihood of seeing
measurements $z_1, \ldots, z_k$ given the presence of targets with states $x_1, \ldots, x_t$. Even if a
multitarget MAP estimator could be defined, we would still need to know that the multitarget
likelihood function is measurable in the variables $z_1, \ldots, z_k$ and continuous in the variables
$x_1, \ldots, x_t$. But measurable with respect to which topology on the multitarget measurement space?
Continuous with respect to which metric on the multitarget state space? The latter question is
far from trivial. What is the distance between a single-target state $x$ and a two-target state
$x_1, x_2$? Or the zero-target state $\emptyset$ and the single-target state $x$? What is the distance between
the two-target states $x_1, x_2$ and $x_2, x_1$ if $x_1 \neq x_2$? (The Euclidean metric gives us
$\|(x_1, x_2) - (x_2, x_1)\| = \|(x_1 - x_2,\, x_2 - x_1)\| \neq 0$. But the state "an F-16 at $x_1$ flown by Joe and an F-22 at
$x_2$ flown by Ralph" is the same multitarget state as "an F-22 at $x_2$ flown by Ralph and an F-16
at $x_1$ flown by Joe".) One can then try to tinker with various metrics for multitarget state space,
only to get pulled into a morass of arbitrary, ad hoc definition-making. If these questions are not
answered we cannot even define a likelihood function. To do so, we need to (1) precisely define
the state and measurement spaces, (2) define topologies on the state and measurement spaces,
(3) define random variables on the state and measurement spaces using these topologies, and (4)
define the multisource, multitarget likelihood function as a conditional distribution in terms of
these random variables. Moreover, how can we systematically construct, from a knowledge of
the Markov state transition models of the individual targets, a Markov state transition model for the
entire multitarget state, as opposed to just assuming that it comes out of nowhere, deus ex
machina? Likewise, how can we systematically construct the multitarget likelihood function
$f(z_1, \ldots, z_k \mid x_1, \ldots, x_t)$ from a knowledge of the characteristics of the individual sensors, without
assuming that it, too, comes out of nowhere, deus ex machina? (Note that in the single-sensor,
single-target case we can explicitly construct likelihood functions directly from our knowledge of
sensor characteristics. For example, suppose that we know that the sensor is described by the
standard Kalman measurement model $z = Cx + w$, where $w$ is random noise with density
$f_w(y)$ and $C$ is a matrix. Then the probability measure (mass function) for $z$ is

$$\int_S f_z(y \mid x)\, dy = \Pr(z \in S) = \Pr(Cx + w \in S) = \Pr(w \in S - Cx) = \int_{S - Cx} f_w(y)\, dy = \int_S f_w(y - Cx)\, dy$$

where $S - Cx$ denotes the set of all $s - Cx$ with $s \in S$. Since this is true for all measurable
$S$, the likelihood function is $f_z(y \mid x) = f_w(y - Cx)$ almost everywhere in $y$. In other words:
the likelihood function $f_z(y \mid x)$ is constructed as the Radon-Nikodym derivative $f_z = dp_z/d\lambda$,
with respect to Lebesgue measure $\lambda$, of the probability measure $p_z(S) = \Pr(z \in S)$.)
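
The single-sensor, single-target construction in the parenthetical note above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the report's implementation: the noise is assumed to have independent zero-mean Gaussian components, the matrix `C` and the standard deviation 2.0 are hypothetical, and the helper names `gaussian_noise_density` and `make_likelihood` are introduced here for illustration. The only fact taken from the text is that the likelihood is the noise density evaluated at the residual y - Cx.

```python
import math
from typing import Callable, List, Sequence


def gaussian_noise_density(sigmas: Sequence[float]) -> Callable[[Sequence[float]], float]:
    """Density f_w of zero-mean noise with independent Gaussian components (assumed)."""
    def f_w(w: Sequence[float]) -> float:
        value = 1.0
        for w_i, s in zip(w, sigmas):
            value *= math.exp(-0.5 * (w_i / s) ** 2) / (math.sqrt(2.0 * math.pi) * s)
        return value
    return f_w


def make_likelihood(C: List[List[float]], f_w: Callable[[Sequence[float]], float]):
    """Return f_z(y | x) = f_w(y - C x) for the measurement model z = C x + w."""
    def f_z(y: Sequence[float], x: Sequence[float]) -> float:
        # Matrix-vector product C x, written out with plain lists.
        Cx = [sum(c_ij * x_j for c_ij, x_j in zip(row, x)) for row in C]
        return f_w([y_i - cx_i for y_i, cx_i in zip(y, Cx)])
    return f_z


C = [[1.0, 0.0]]  # hypothetical: observe position only, from (position, velocity)
f_w = gaussian_noise_density([2.0])
f_z = make_likelihood(C, f_w)

peak = f_z([5.0], [5.0, 1.0])      # measurement equals C x, so the residual is 0
off_peak = f_z([9.0], [5.0, 1.0])  # residual of 4.0
```

The point of the construction is that nothing beyond the sensor model and the noise density is needed; the multitarget case has no equally direct recipe, which is the difficulty the text raises.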

In  summary,  if  we  take  things  for  granted  and  simply  declare  victory,  we  are  led  into 
fundamental  conceptual  difficulties  that  have  a  direct  bearing  on  engineering  practice.  What  is 
at  issue  is  neither  theoretical  hair-splitting  nor  mere  mathematical  "bookkeeping."  Rather,  what 
is  actually  at  stake  is  our  ability  to  do  optimal  multitarget  filtering  and  estimation  at  all  and, 
moreover,  our  ability  to  even  know  what  "optimal"  means  in  such  a  context. 

4-A-c.  The  Need  for  This  Work:  USARO  Technology  Requirements.  The  proposed  work 
directly  addresses  the  following  problem  areas  described  in  the  1995  USARO  Broad  Area 
Announcement: 

•  "...determining information-theoretic bounds on the performance of any algorithm, given
parameters  of  the  sensor  and  the  scene."  (p.  19) 

•  "Processes  such  as  target  detection,  target  recognition  and  classification,  and  tracking 
often  require  fusion  of  information  from  potentially  diverse  sources.  The  increasing 
volume and variety of battlefield data makes the "higher" levels of information fusion --
situation and threat assessment -- increasingly crucial. Decision-making at these higher
fusion levels often involves forms of ambiguity more extreme than those addressed by
conventional statistical analysis: imprecision, vagueness, indiscernability, etc. Therefore,
it  is  essential  that  information  from  sources  such  as  images,  signals,  voice  messages, 
geographical  information,  natural  language  text,  and  prior  knowledge/rule  bases  be 
presented  in  a  unified  framework.  Mathematical  methods  for  measuring  and  representing 
information  content  in  diverse  and  ambiguous  data  are  fundamental.  Recent  approaches 
(Dempster-Shafer,  fuzzy  logic,  rule-based,  rough  sets,  statistical  capacities,  etc.)  offer 
a  way  of  dealing  with  highly  ambiguous  information.  A  systematic,  tractable  framework 
is  needed  that  will  allow  diverse  input  data  streams  to  be  transformed  into  a  unified 
information  fusion  space.  Such  measures  should  enable  prediction  of  the  level  of  system 
performance  achievable  based  upon  the  information  content  of  sources  available, 
knowledge gained from previous experience, tasks to be performed, and constraints in the
context  of  the  task  to  be  performed."  (p.  19) 

•  "In  higher  levels  of  information  fusion  such  as  situation  and  threat  assessment  and 
mission  management,  the  primary  objective  is  to  simulate  a  human  expert  (e.g.  rule- 
based  system,  adaptive  system,  neural  networks,  etc.).  Issues  which  require  further 
research  include:  more  systematic  approaches  to  situation  assessment  which  permit 
effective performance analysis, prediction, and evaluation; and assessment of measures
of stability and performance. A common methodology should be developed that would
support the optimal, adaptive management of information collection resources." (p. 20)

•  "The  fundamental  concept  of  a  system  that  can  process  artificially  sensed  information, 
make optimal decisions based on this information and well-defined objectives and translate
these  decisions  into  actions  is  a  guiding  and  unifying  theme  for  basic  research  in  all 
major  aspects  of  this  area  [Foundations  of  Intelligent  Control  Systems]."  (p.  49) 

In addition, the proposed work directly and explicitly addresses the following requirements set forth
in the report of the USARO Myrtle Beach Electronics Strategy Planning Workshop, Jan. 9-12,
1995:


Thrust  IA-7  Mathematical  Representations  and  Measures  to  Unify  Decision-Making 
Based  on  Diverse  Information  (Extremely  High  Priority).  "Processes,  such  as  target 
detection,  target  recognition  and  classification,  and  tracking  often  require  fusion  of 
information  from  potentially  diverse  sources.  The  increasing  volume  and  variety  of 
battlefield data makes the higher levels of information fusion -- situation and threat
assessment, mission management -- increasingly crucial. Decisionmaking at these higher
fusion  levels  often  involves  forms  of  ambiguity  more  extreme  than  those  addressed  by 
conventional  statistical  analysis:  imprecision,  vagueness,  indiscernability,  partial 
contingency,  etc.  Therefore,  it  is  essential  that  information  from  sources  such  as  images, 
signals,  voice  messages,  geographical  information,  natural  language  text,  and  prior 
knowledge/rule  bases  be  presented  in  a  unified  framework.  Mathematical  methods  for 
measuring  and  representing  information  content  in  diverse  and  ambiguous  data  are 
fundamental.  Classical  statistical  decision  theory  provides  methods  for  dealing  with 
uncertain information when the underlying probabilities are known or can be treated
subjectively.  Recent  approaches  (Dempster-Shafer,  fuzzy  logic,  rule-based,  rough  sets, 
statistical  capacities,  etc.)  offer  a  way  of  dealing  with  highly  ambiguous  information. 
These  methods  utilize  a  variety  of  information-quality  measures:  possibility,  belief, 
entropy,  etc.  A  systematic,  tractable  framework  is  needed  that  will  allow  diverse  input 
data  streams  to  be  transformed  into  a  unified  information  fusion  space.  This  framework 
should  provide  systematic,  tractable  measures  of  information  quality.  Such  measures 
should  enable  prediction  of  the  level  of  system  performance  achievable  (see  Thrust  IA-4) 
based  upon  the  information  content  of  sources  available,  knowledge  gained  from  previous 
experience,  tasks  to  be  performed,  and  constraints  in  the  context  of  the  task  to  be 
performed."  (pp.  40-41) 

Thrust  IA-8  Multi-Criterion,  Multi-Expert,  and  Multi-Source  Decision  Analysis  (Very 
High  Priority).  "In  conventional  approaches  to  optimization,  a  key  assumption  is  that 
the  performance  of  a  system  can  be  assessed  by  a  single  criterion,  e.g.  cost.  In  many 
real-world  situations  this  is  not  the  case.  Furthermore,  performance  assessments, 
decisions  and/or  estimates  may  be  provided  by  a  number  of  experts  or  fusion  sources, 
each  employing  different  evaluation  criteria  and  using  possibly  overlapping  data  sources. 
A  common  examples  [sic]  is  "track-to-track"  fusion,  in  which  existing  and  possibly 
correlated  or  conflicting  estimates/decisions  must  be  fused  into  a  valid  composite  picture. 
Available  methods  are  hard  to  apply  and  are  lacking  in  computational  efficiency. 
Techniques  are  needed  which  are  capable  of  representing  preferences,  expert  credibilities, 
weights  of  criteria  importance,  and  data  dependencies  in  qualitative  terms  that  lead  to  an 
aggregated  choice  of  alternatives  which  are  preferable  or  admissible  but  not  necessarily 
optimal."  (p.  41) 

Thrust IA-9 Methodologies for Situation Analysis (High Priority). "In higher levels of
information  fusion  such  as  situation  and  threat  assessment  and  mission  management,  the 
primary  objective  is  a  simulation  of  a  human  expert  (e.g.  rule-based  system,  adaptive 
system,  neural  networks,  etc.)  which  approximates  the  expertise  of  the  human.  Issues 
which  require  further  research  include:  more  systematic  approaches  to  situation 
assessment  which  permit  effective  performance  analysis,  prediction,  and  evaluation;  and 
assessment  of  measures  of  stability  and  performance.  A  common  methodology  should 
be  developed  that  would  support  the  optimal,  [sic]  of  information  resources."  (p.  42) 

4-B.  SUMMARY  OF  MOST  IMPORTANT  RESULTS 

4-B-a.  Background  and  Objectives.  In  this  section  we  will  summarize  the  major  findings  of 
the  work  completed  thus  far  under  contract  DAAH04-94-C-0011.  The  basic  problems  of 
data/information  fusion  are  summarized  in  [1,  17,  60].  During  the  current  project  Lockheed 
Martin  constructed  a  rigorous,  fully  probabilistic  scientific  foundation  for  the  following  aspects 
of  information  fusion: 

(1)  Multisource integration based on parametric estimation and Markov techniques [36, 38,
40, 42, 43, 44]

(2)  Prior  information  regarding  the  numbers,  identities,  and  geokinematics  of  targets  [36] 

(3)  Sensor  management  based  on  information  theory  and  nonlinear  control  theory  [32] 

(4)  Performance  evaluation  using  information  theory  and  nonparametric  estimation  [39,  41] 

(5)  Expert-systems  theory:  fuzzy  logic,  evidential  theory,  rule-based  inference  [29,  30,  35] 

The  existence  of  this  unification  approach  implies  the  existence  of  algorithms  which  fully  unify, 
in  a  single  statistical  process,  the  following  functions  of  information  fusion: 

•  Target  detection 

•  Target  identification 

•  Target  tracking  and  localization 

•  Prior  information  with  respect  to  detection,  classification,  and  tracking 

•  Ambiguous  evidence  and  precise  data 

•  Evidential  rules  of  combination 

•  Sensor  management,  including  optimal  selection  of  sensor-control  parameters 

Not  all  sensors  can  be  directly  subsumed  within  this  unification,  however.  We  assume  the 
following  sensor  types:  (1)  point-source,  (2)  range-profile,  (3)  line-of-bearing,  (4)  human 
observers reporting in natural language, (5) rulebases, (6) imaging sensors whose target-images
are point "firefly" sources, (7) imaging sensors whose target-images consist of relatively small
clusters of point energy reflectors (also sometimes called "extended targets").

Random  set  theory  was  developed  by  Kendall  [25]  and  Matheron  [45]  in  the  context  of  stochastic 
geometry.  Since  the  late  1970s,  several  researchers  have  investigated  the  connections  between 
random  sets  and  fuzzy  logic  [9,  11,  51],  the  Dempster-Shafer  evidential  theory  [8,  18,  31,  37, 
49,  57],  rule-based  inference  [33,  34],  general  expert  systems  theory  [13,  15,  26,  53],  and  data 
fusion  [10,  47,  48].  The  work  performed  under  the  current  contract  has  built  upon  and  greatly 
extended  this  existing  body  of  research,  resulting  in  a  systematic  and  general  mathematical 
apparatus  for  solving  information  fusion  problems. 

The  basic  approach  is  as  follows.  A  suite  of  known  sensors  transmits  to  a  central  data  fusion 
site  the  observations  they  collect  regarding  a  group  of  targets  whose  numbers,  positions, 
velocities,  identities,  threat  states,  etc.  are  unknown.  Finite-set  statistics  arises  when  we 
mathematically reformulate the multisensor, multisource problem as a single-sensor,
single-source problem:

•  A group of sensors reporting to a central data fusion site is modeled as a single "global" sensor

•  An unknown number of targets is modeled as a single "global" target

•  A group of reports $\xi_1, \ldots, \xi_k$, collected when the sensors interrogate the targets at a
given instant, is modeled as a single "global" report $Z = \{\xi_1, \ldots, \xi_k\}$
Simplifying somewhat, each individual target state will have the form

$$\zeta = (x, v, j)$$

where $x$ is a continuous state vector (e.g. geokinematics), $v$ is a discrete state variable (e.g.
target class) and $j$ is a (generally unknown and unknowable) unique identifying tag associated
with each target. The complete state of a multitarget system is, therefore, represented by a "set
parameter" of the form

$$X = \{\zeta_1, \ldots, \zeta_t\}$$

which is, in general, a specific value of a randomly varying parameter-set $\Gamma$.

Likewise,  each  individual  observation  will  have  the  form 

$$\xi = (z, u, i)$$

where $z$ is a continuous variable (geokinematics, signal intensity, etc.) in $\mathbb{R}^n$, where $u$ is a
discrete variable (e.g. possible target attributes) drawn from a finite universe $U$ of possibilities,
and where $i$ is a "sensor tag" which identifies the sensor which supplied the measurement. If
the total observation-set (the global observation)

$$Z = \{\xi_1, \ldots, \xi_k\}$$

collected by the global sensor is treated as a single entity, then it is a specific value of a
randomly varying finite observation-set $\Sigma$.

4-B-b. Random Set Formulation of Data Fusion Problems. The global report, which varies
randomly in regard to kinematics, identities, and numbers of elements, is a finite random set,
say $\Sigma$. Given this, it becomes possible in principle to reformulate the multisensor, multitarget
data fusion problem as a single-sensor, single-target problem. The statistics of the finite random
set $\Sigma$ are determined not by a probability measure but rather by the "belief measure"
$\beta_\Sigma(S \mid X) = p(\Sigma \subseteq S)$, which belongs to a family of belief measures parametrized by a
parameter $X$ which consists of a finite subset of (conventional) parameters.

In more detail, from Matheron’s random set theory we know that the class of finite subsets of
measurement space has a topology, the so-called hit-or-miss topology [45,46]. If $O$ is any
Borel subset of this topology then the statistics of $\Sigma$ are characterized by the associated
probability measure

$$p_\Sigma(O) = p(\Sigma \in O)$$

The Choquet-Matheron capacity theorem (see [45], pg. 30) tells us, among other things, that we
need only consider Borel sets of the specific form $O^c_{S^c}$, i.e., the class whose elements are
all closed subsets $C$ of measurement space such that $C \cap S^c = \emptyset$ (i.e., $C \subseteq S$), where $S$ is
some closed subset of measurement space. In this case

$$p_\Sigma(O^c_{S^c}) = p(\Sigma \subseteq S) = \beta_\Sigma(S)$$

The set function $\beta_\Sigma$ is known as the belief measure of the randomly varying finite subset $\Sigma$,
or alternatively as an "infinitely monotone capacity." Whereas a probability measure $q$ is
"additive" (i.e., $q(S \cup T) = q(S) + q(T)$ if $S \cap T = \emptyset$) a belief measure is, in general,
nonadditive: $\beta(S \cup T) \ge \beta(S) + \beta(T)$ if $S \cap T = \emptyset$. Thus we can use the nonadditive measure
$\beta_\Sigma$, defined on subsets of ordinary observation space, instead of the additive measure $p_\Sigma$, which
is defined on subsets of the class of finite sets of observation space.

Despite  the  fact  that  the  belief  measure  is  nonadditive,  it  plays  the  same  role  in  multisensor, 
multitarget  statistics  that  ordinary  probability  measures  play  in  single-sensor,  single-target 
statistics. The reason is that from the belief measure we can construct a "global density
function" $f_\Sigma(Z)$ which describes the comprehensive statistical behavior of the entire sensor suite.
Just as the density of a random vector can be derived from its cumulative probability function
through iterated differentiation, so the global density of a finite random set can be derived from
its belief measure via iteration of a generalized form of the familiar Radon-Nikodym derivative
from Lebesgue measure theory.

In what follows, $\mathcal{R} = \mathbb{R}^n \times U \times \{1, \dots, s\}$ denotes a "hybrid space" of continuous and discrete
observation variables, where $U$ is some finite set. That is, a typical element $\xi = (z, u, j) \in \mathcal{R}$
is a triple consisting of a continuous (e.g. geopositional) observation $z \in \mathbb{R}^n$, a discrete
observation $u \in U$ (e.g. a target identity, target class, or other discrete identity-type attribute),
and an integer "sensor tag" $1 \le j \le s$ which identifies the sensor which collected the
continuous-discrete observation $(z, u)$. Let $Z = \{\xi_1, \dots, \xi_k\}$ be a finite subset of $\mathcal{R}$ with
$\xi_1, \dots, \xi_k$ being distinct.

4-B-c.  An  Integral  and  Differential  Calculus  for  Data  Fusion.  The  theoretical  core  of  the 
random  set  approach  is  a  generalization  of  integral  and  differential  calculus  to  certain  types  of 
set  functions  (e.g.  belief  measures).  The  set  integral  and  its  inverse  operation  the  set  derivative 
play  somewhat  the  same  role  in  multisensor,  multitarget  problems  that  the  conventional  integral 
and  derivative  play  in  single-sensor,  single-target  problems. 

If an additive measure $p_Z(S) = p(Z \in S)$ of a random vector $Z$ is absolutely continuous with
respect to Lebesgue measure $\lambda$ then by the Radon-Nikodym theorem one can, in principle,
determine the density function

$$f_Z = \frac{dp_Z}{d\lambda}$$

that corresponds to it. Conversely, the measure can be recovered from the density through
application of the Lebesgue integral

$$\int_S f_Z(z)\,d\lambda(z) = p_Z(S)$$

We have shown how to define an integral and differential calculus of functions of a set variable
which obeys similar properties. Given a vector-valued function $f(Z)$ of a finite-set variable
$Z$ we have shown how to define a "set integral" of the form $\int_S f(Z)\,\delta Z$. Conversely, given a
vector-valued function $\Phi(S)$ of a closed set variable $S$, we have shown how to constructively
define a "set derivative" of the form $\delta\Phi/\delta Z$. Under certain assumptions, these operations turn
out to be inverse to each other:

$$\int_S \frac{\delta\Phi}{\delta Z}(\emptyset)\,\delta Z = \Phi(S), \qquad \left[\frac{\delta}{\delta Z}\int_S f(Y)\,\delta Y\right]_{S=\emptyset} = f(Z)$$

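In more detail, the global density $f_\Sigma(Z)$ is derived from the belief measure $\beta_\Sigma(S \mid X)$ by an
iterated differentiation:

$$f_\Sigma(Z) = \frac{\delta^k \beta_\Sigma}{\delta\xi_1 \cdots \delta\xi_k}(\emptyset), \qquad Z = \{\xi_1, \dots, \xi_k\}$$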

This differentiation, called "generalized Radon-Nikodym differentiation," is defined by

$$\frac{\delta\Phi}{\delta\xi}(T) = \lim_{E \to \{z\}} \frac{\Phi\big(T \cup (E \times \{u\} \times \{j\})\big) - \Phi(T)}{\lambda(E)}$$

for any $\xi = (z, u, j) \in \mathcal{R}$, where $E$ is a closed ball in $\mathbb{R}^n$ centered at $z$ which converges to
the point-set $\{z\}$, and where $\lambda(E)$ denotes the Lebesgue measure (hypervolume) of $E$ in $\mathbb{R}^n$.
(This is a somewhat simplified definition, assuming $\xi \notin T$. Also, the limit as $E$ approaches
the singleton set $\{z\}$ requires careful treatment based on the constructive definition of the
Radon-Nikodym derivative [56, 61], as described in section 4.2.3 of Chapter 4 of [12].) The
belief measure can be recovered from the global density via integration,

$$\beta_\Sigma(S) = \int_S f_\Sigma(Z)\,\delta Z$$

which, setting $S = \mathcal{R}$, yields the normality condition $\int f_\Sigma(Z)\,\delta Z = 1$. Here, the integral is
a so-called set integral, which is defined as follows. Let $\Phi(Z)$ be a function whose arguments
are finite sets $Z$. Then

$$\int_S \Phi(Z)\,\delta Z = \sum_{k=0}^{\infty} \frac{1}{k!} \int_{S^k} \Phi(\{\xi_1, \dots, \xi_k\})\,d\lambda(\xi_1) \cdots d\lambda(\xi_k)$$

where the integrals on the right-hand side of the equation are "hybrid integrals" and where
$S^k = S \times \cdots \times S$ denotes the Cartesian product of $S$ taken with itself $k$ times. The "hybrid
integral" (i.e., the integral defined in terms of the product measure of Lebesgue measure on $\mathbb{R}^n$
and the counting measure on $U$) of a function $\phi(\xi)$ whose arguments are $\xi = (z, u) \in \mathcal{R}$ is
defined by:

$$\int_S \phi(\xi)\,d\lambda(\xi) = \sum_{u \in U} \int_{S(u)} \phi(z, u)\,d\lambda(z)$$

for any $S \subseteq \mathcal{R}$, where $S(u)$ denotes the subset of all $z \in \mathbb{R}^n$ such that $(z, u) \in S$.
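
As a rough numerical sketch of the set integral, consider scalar observations with no discrete attribute or sensor tag, so that the hybrid integral reduces to an ordinary integral over an interval S = [a, b]. The Python function below truncates the sum over cardinalities and approximates each k-fold integral by a midpoint rule. The names `set_integral`, `max_card`, `n_grid` and the uniform one-track example are illustrative assumptions, not notation or models taken from this report.

```python
import itertools
import math

def set_integral(phi, a, b, max_card=2, n_grid=100):
    """Approximate the set integral of phi over S = [a, b].

    phi takes a tuple of reals (a finite set written as a tuple) and must be
    symmetric in its arguments. The sum over cardinalities k is truncated at
    max_card; each k-fold integral uses a midpoint rule on n_grid cells.
    """
    h = (b - a) / n_grid
    mids = [a + (i + 0.5) * h for i in range(n_grid)]
    total = phi(())                                   # k = 0 term
    for k in range(1, max_card + 1):
        acc = sum(phi(pt) for pt in itertools.product(mids, repeat=k))
        total += acc * h ** k / math.factorial(k)     # 1/k! removes ordering
    return total

# Assumed example: one target on S = [0, 10], detected with probability q and
# then reported uniformly on S; no false alarms, so |Z| >= 2 has density 0.
q = 0.8

def one_track(Z):
    if len(Z) == 0:
        return 1.0 - q
    if len(Z) == 1:
        return q / 10.0
    return 0.0

print(set_integral(one_track, 0.0, 10.0))  # normality check, close to 1
```

The cost of the k-fold sums grows as `n_grid**k`, so this brute-force form is only practical for very small cardinalities.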

4-B-d. The Global Density of a Sensor Suite. Given certain absolute continuity assumptions
which need not concern us here, if $\beta_\Sigma$ is the belief measure of a random observation set $\Sigma$
then the quantity

$$f_\Sigma(Z) = \frac{\delta\beta_\Sigma}{\delta Z}(\emptyset)$$

is the density function of the random finite subset $\Sigma$. (This is essentially equivalent to what,
in point process theory, are called "Janossy densities.") The quantity $f_\Sigma(Z)$ is the likelihood
(i.e., probability density) that the event $\Sigma = Z$ will occur. On the other hand, $f_\Sigma(Z)$ also has
a completely Bayesian interpretation. Suppose that $f_\Sigma(Z \mid X)$ is a global density with a set
parameter $X = \{\zeta_1, \dots, \zeta_t\}$ where $\zeta_1, \dots, \zeta_t$ are the unknown (discrete and continuous)
parameters of the targets, and $t$ is the unknown number of the targets. Then $f_\Sigma(Z \mid X)$ is the
total probability density of association between the measurements in $Z$ and the parameters in
$X$.

The  global  density  of  a  sensor  suite  differs  from  conventional  densities  in  that  it  encapsulates 
the  comprehensive  statistical  behavior  of  the  entire  sensor  suite  into  a  single  mathematical 
object,  and  not  just  sensor  noise  statistics.  Most  generally  speaking,  a  global  density  has  the 
form $f_\Sigma(Z \mid X, Y)$ and includes the following information:

•  the observation-set $Z = \{\xi_1, \dots, \xi_k\}$

•  the set $X = \{\zeta_1, \dots, \zeta_t\}$ of unknown parameters

•  the states $Y = \{\eta_1, \dots, \eta_s\}$ of the sensors (orientations, modes, etc.)

•  the sensor-noise distributions of the individual sensors

•  the probabilities of detection and false alarm for the individual sensors

•  clutter models

•  detection profiles for the individual sensors (as functions of range, aspect, etc.)

Global  density  functions  can  be  computed  explicitly  from  a  knowledge  of  the  sensors  that  belong 
to  the  sensor  suite.  For  example,  suppose  that  we  are  given  a  single  sensor  with  (conventional) 
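sensor-noise density $f(\xi \mid \zeta)$ with no false alarms and constant probability of detection $p_D$, and
that observations are independent. Then the global density which specifies the multitarget
measurement model for the sensor can be shown to be

$$f_\Sigma(\{\xi_1, \dots, \xi_k\} \mid \{\zeta_1, \dots, \zeta_t\}) = p_D^k\,(1 - p_D)^{t-k} \sum_{1 \le i_1 \ne \cdots \ne i_k \le t} f(\xi_1 \mid \zeta_{i_1}) \cdots f(\xi_k \mid \zeta_{i_k})$$

where the summation is taken over all distinct $i_1, \dots, i_k$ such that $1 \le i_1, \dots, i_k \le t$. (Here,
$\xi_1, \dots, \xi_k$ are assumed distinct, as are $\zeta_1, \dots, \zeta_t$.)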
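
A minimal Python sketch of this multitarget measurement model follows, assuming scalar states and observations and a Gaussian sensor-noise density. The function names `gauss` and `global_density` and the parameter `p_d` are hypothetical choices made for illustration; the sum over distinct index tuples is enumerated by brute force, which is only feasible for small numbers of targets.

```python
import itertools
import math

def gauss(z, x, sigma=1.0):
    """Assumed scalar Gaussian sensor-noise density f(z | x)."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def global_density(Z, X, p_d, f=gauss):
    """Global density f(Z | X): one sensor, no false alarms, constant p_d.

    Z is a sequence of distinct observations, X a sequence of distinct
    target states.
    """
    k, t = len(Z), len(X)
    if k > t:
        return 0.0          # without false alarms, reports cannot outnumber targets
    total = 0.0
    # permutations(range(t), k) yields every ordered k-tuple of distinct indices
    for idx in itertools.permutations(range(t), k):
        total += math.prod(f(z, X[i]) for z, i in zip(Z, idx))
    return p_d ** k * (1.0 - p_d) ** (t - k) * total

X = (0.0, 5.0)
print(global_density((), X, 0.9))           # both targets missed
print(global_density((0.2,), X, 0.9))       # exactly one report
print(global_density((0.2, 4.7), X, 0.9))   # both targets detected
```

Because every assignment of reports to targets is summed over, the result does not depend on the order in which the elements of `Z` are listed.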


Global  densities  satisfy  the  properties  that  one  would  expect.  For  example,  if  T(Z)  is  a 
measurable vector-valued transformation of a finite-set variable then the expectation $E[T(\Sigma)]$
of the random vector $T(\Sigma)$ is

$$E[T(\Sigma)] = \int T(Z)\,f_\Sigma(Z)\,\delta Z$$

where the integral is a set integral. Global densities also enjoy certain unexpected properties,
the most interesting of which is the fact that they are continuous analogs of the Möbius
transform of Dempster-Shafer theory. That is, if $Z = \{z_1, \dots, z_k\}$ with $|Z| = k$ then it can
be shown that

$$f_\Sigma(Z) = \lim_{i \to \infty} \frac{\sum_{Y \subseteq Z} (-1)^{|Z - Y|}\,\beta_\Sigma(E_{Y;i})}{\lambda(E_{z_1;i}) \cdots \lambda(E_{z_k;i})}$$

where $E_{Y;i}$ is defined by

$$E_{Y;i} = \bigcup_{z \in Y} E_{z;i}$$

and where $E_{z;i}$ denotes a closed ball of radius $1/i$ centered at $z \in \mathbb{R}^n$.
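
The following Python sketch checks this Möbius-type relation numerically, under the reading that the global density is the limit of an alternating sum of belief-measure values over subsets Y of Z, divided by the product of the ball volumes. It assumes scalar Gaussian reports, no false alarms, a constant detection probability `q`, and a belief measure that factors over independent targets; the names `belief` and `mobius_density` are hypothetical.

```python
import math

def pdf(a, x, sigma=1.0):
    """Assumed Gaussian report density f(a | x)."""
    return math.exp(-0.5 * ((a - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def cdf(a, x, sigma=1.0):
    """Gaussian cumulative distribution function for a report from a target at x."""
    return 0.5 * (1.0 + math.erf((a - x) / (sigma * math.sqrt(2.0))))

def belief(intervals, X, q):
    """Belief measure of S = union of disjoint intervals, independent targets X."""
    out = 1.0
    for x in X:
        p = sum(cdf(hi, x) - cdf(lo, x) for lo, hi in intervals)
        out *= 1.0 - q + q * p
    return out

def mobius_density(Z, X, q, r):
    """Alternating sum over subsets Y of Z, using intervals of radius r."""
    k = len(Z)
    total = 0.0
    for mask in range(2 ** k):
        Y = [Z[i] for i in range(k) if (mask >> i) & 1]
        sign = (-1) ** (k - len(Y))
        total += sign * belief([(z - r, z + r) for z in Y], X, q)
    return total / (2.0 * r) ** k      # each interval has length 2r

Z, X, q = (0.3, 4.6), (0.0, 5.0), 0.9
exact = q ** 2 * (pdf(Z[0], X[0]) * pdf(Z[1], X[1]) + pdf(Z[1], X[0]) * pdf(Z[0], X[1]))
print(mobius_density(Z, X, q, r=1e-3), exact)
```

The radius must stay small enough that the intervals around distinct observations do not overlap, and large enough that floating-point cancellation in the alternating sum remains negligible.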

4-B-e.  A  Simple  Illustration.  Assume  that  two  identical  targets  located  on  the  real  line  are 
observed by a single Gaussian sensor with Gaussian density

$$f(a \mid x) = N_{\sigma^2}(a - x)$$

with variance $\sigma^2$ and associated probability measure $p_f(S \mid x)$. Assume also that (1) reports
are independent, (2) the probability of false alarm for the sensor is zero, (3) the probability of
detection $q$ is not necessarily unity, and (4) that $q$ is constant over the region of interest. The
observations corresponding to the first target are the specific realizations of a Gaussian random
number $A_1$ and the observations corresponding to the second target are modeled by another
Gaussian random number $A_2$. If $q = 1$ then the observation generated by the first target is
a randomly-varying single-element set of the form $\Sigma_1 = \{A_1\}$ and the observation generated by
the second target is $\Sigma_2 = \{A_2\}$. What the sensor sees at any given instant is just an unordered
two-element set $Z = \{a_1, a_2\}$ of reports generated by the randomly varying two-element set
$\Sigma = \Sigma_1 \cup \Sigma_2 = \{A_1, A_2\}$. If $q < 1$, however, then the possible realizations for $\Sigma$ are
$\Sigma = Z$ where 1) $Z = \{a_1, a_2\}$ for any $a_1, a_2$ in $\mathbb{R}$ with $a_1 \ne a_2$; 2) $Z = \{a\}$ for any $a$ in
$\mathbb{R}$; or 3) $Z = \emptyset$.

The statistics of this random set are described by the belief measure

$$\beta_\Sigma(S \mid X) = p(\Sigma \subseteq S) = p(\Sigma_1 \subseteq S,\ \Sigma_2 \subseteq S)$$

where $X$ is a set parameter which can take the form $X = \emptyset$ (no tracks), $X = \{x\}$ (one
track), or $X = \{x_1, x_2\}$ with $x_1 \ne x_2$ (two tracks). In fact,

$$\beta_\Sigma(S \mid \{x\}) = 1 - q + q\,p_f(S \mid x)$$

$$\beta_\Sigma(S \mid \{x_1, x_2\}) = [1 - q + q\,p_f(S \mid x_1)]\,[1 - q + q\,p_f(S \mid x_2)]$$

From the belief measure it is possible, using the set derivative, to compute the global density.
In the one-track case $X = \{x\}$ it is

$$f_\Sigma(\emptyset \mid X) = 1 - q, \qquad f_\Sigma(\{z\} \mid X) = q\,f(z \mid x)$$

and in the two-track case $X = \{x_1, x_2\}$ it is

$$f_\Sigma(\emptyset \mid X) = (1 - q)^2$$

$$f_\Sigma(\{z\} \mid X) = q(1 - q)\,f(z \mid x_1) + q(1 - q)\,f(z \mid x_2)$$

$$f_\Sigma(\{z_1, z_2\} \mid X) = q^2 f(z_1 \mid x_1)\,f(z_2 \mid x_2) + q^2 f(z_2 \mid x_1)\,f(z_1 \mid x_2)$$

In all other cases, $f_\Sigma(Z \mid X) = 0$. The one-track case is just the conventional single-sensor,
single-target situation with the probability of detection taken into explicit account ([47] p. 15,
[48]). The two-track case is more interesting. Given any real numbers $a_1 < a_2$ the set
$\{a_1, a_2\} = \{a_2, a_1\}$ does not depend on the order of $a_1, a_2$ and therefore can be identified with the
point $(a_1, a_2)$ on the half-plane. Likewise, when $a_1 = a_2 = a$ the set $\{a\}$ can be
identified with a point on the diagonal boundary line $x = y$ of the half-plane. In other words, all of the
possible two-observation realizations $Z = \{a_1, a_2\}$ of the random set $\Sigma$ can be identified with
points on the half-plane. The global density $f_\Sigma(\{a_1, a_2\} \mid X)$ gives the probability density that
$\{a_1, a_2\}$ occurs as a value of the random set $\{A_1, A_2\}$. Its graph is just a surface over the
half-plane which has a peak near (but usually not the same as) the point $(x_1, x_2)$. The reader
may also easily verify that the belief measure $\beta_\Sigma(S \mid X)$ can be recovered from the global
density via the set integral:

$$\beta_\Sigma(S \mid X) = \int_S f_\Sigma(Z \mid X)\,\delta Z = f_\Sigma(\emptyset \mid X) + \int_S f_\Sigma(\{z\} \mid X)\,dz + \frac{1}{2} \int_{S \times S} f_\Sigma(\{z_1, z_2\} \mid X)\,dz_1\,dz_2$$

for any closed $S \subseteq \mathbb{R}$ and any finite $X \subseteq \mathbb{R}$.
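
The recovery of the belief measure from the global density can be checked numerically for the two-track case. The Python sketch below takes S to be an interval [a, b], evaluates the three terms of the set integral with a midpoint rule, and compares the result with the product form of the belief measure. The values of `Q` and `SIGMA`, the interval, and all function names are assumptions chosen only for illustration.

```python
import math

Q, SIGMA = 0.9, 1.0   # assumed detection probability and sensor standard deviation

def pdf(z, x):
    return math.exp(-0.5 * ((z - x) / SIGMA) ** 2) / (SIGMA * math.sqrt(2.0 * math.pi))

def cdf(a, x):
    return 0.5 * (1.0 + math.erf((a - x) / (SIGMA * math.sqrt(2.0))))

def density(Z, X):
    """Two-track global density f(Z | {x1, x2}) for |Z| = 0, 1, 2."""
    x1, x2 = X
    if len(Z) == 0:
        return (1.0 - Q) ** 2
    if len(Z) == 1:
        return Q * (1.0 - Q) * (pdf(Z[0], x1) + pdf(Z[0], x2))
    if len(Z) == 2:
        z1, z2 = Z
        return Q ** 2 * (pdf(z1, x1) * pdf(z2, x2) + pdf(z2, x1) * pdf(z1, x2))
    return 0.0

def belief_product(a, b, X):
    """Product form of the belief measure for S = [a, b]."""
    out = 1.0
    for x in X:
        out *= 1.0 - Q + Q * (cdf(b, x) - cdf(a, x))
    return out

def belief_set_integral(a, b, X, n=300):
    """Set integral of the global density over S = [a, b] (midpoint rule)."""
    h = (b - a) / n
    mids = [a + (i + 0.5) * h for i in range(n)]
    one = sum(density((z,), X) for z in mids) * h
    two = sum(density((z1, z2), X) for z1 in mids for z2 in mids) * h * h
    return density((), X) + one + 0.5 * two

X = (0.0, 5.0)
print(belief_set_integral(-1.0, 6.0, X), belief_product(-1.0, 6.0, X))
```

The factor 1/2 on the double integral compensates for counting each unordered pair twice.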

4-B-f. The Parallelism Between Point- and Finite-Set Statistics. Because of the existence of
the set derivative and the set integral, one can compile the following list of direct mathematical
parallels between the single-sensor, single-target random-vector world (conventional statistics)
and the multisensor, multitarget finite random set world ("finite-set statistics"). These parallels
can be expressed as a "translation dictionary":

Random Vector, $Z$                                 Finite Random Set, $\Sigma$
sensor                                             global sensor
target                                             global target
report, $z$                                        global set report, $Z$
vector parameter, $x$                              set parameter, $X$
differentiation, $dp_Z/dz$                         set differentiation, $\delta\beta_\Sigma/\delta Z$
integration, $\int_S f_Z(z \mid x)\,d\lambda(z)$   set integration, $\int_S f_\Sigma(Z \mid X)\,\delta Z$
probability measure, $p_Z(S \mid x)$               belief measure, $\beta_\Sigma(S \mid X)$
density, $f_Z(z \mid x)$                           global density, $f_\Sigma(Z \mid X)$
prior density, $\pi(x)$                            global prior density, $\pi(X)$
estimators                                         global estimators
information (entropy) metrics                      global information metrics
Markov transition densities                        global Markov transition densities
nonlinear filtering                                global nonlinear filtering
control theory                                     global control theory

This parallelism between the information fusion and single-sensor, single-target tracking worlds
means that general statistical methodologies can, with a bit of work, be directly "translated"
from the random-vector case to the random-set case. That is, any theorem or mathematical
algorithm in conventional statistics can be thought of as a "sentence" in a language whose
"words" and "grammar" consist of the basic concepts in the left-hand column above. The above
two columns can be thought of as a "dictionary" which establishes a direct correspondence
between the words and grammar in the random-vector language and the cognate words and
grammar of the random-set language. Consequently, any "sentence" (any theorem or
mathematical algorithm) phrased in the random-vector language can, in principle, be directly
"translated" into a corresponding sentence (and thus corresponding theorem or algorithm) in the
random-set language. This process of direct translation can be encapsulated as a general
principle:


The  "Almost-Parallel  Worlds  Principle"  (APWOP):  Nearly  every  theorem, 
method,  or  algorithm  for  single-sensor,  single-target  tracking  implies  an 
analogous  theorem,  method,  or  algorithm  in  multi-sensor,  multi-target  data  fusion 


We  say  "almost-parallel"  because,  as  with  any  translation  process,  the  correspondence  between 
dictionaries  is  not  precisely  one-to-one.  For  example,  there  apparently  is  no  natural  way  to  add 
and  subtract  finite  sets  as  one  does  vectors.  Nevertheless,  the  parallelism  is  complete  enough 
that,  provided  one  exercises  some  care,  a  hundred  years  of  accumulated  knowledge  in 
conventional  (i.e.,  single-sensor,  single-target)  statistics  can  be  directly  brought  to  bear  on 
multisensor, multitarget information fusion problems. In particular, we offer the following
specific examples of the potential utility and power of the "almost-parallel worlds principle."


a.  Data  Fusion  Information  Metrics.  Suppose  that  we  wish  to  attack  the  problem  of 
performance  evaluation  of  information  fusion  algorithms  in  a  scientifically  defensible  manner. 
In  the  proposal  which  led  to  this  contract,  we  defined  a  "global"  version  of  the  conventional 
Kullback-Leibler  discrimination  (or  cross-entropy)  metric  which  is  applicable  to  performance 
evaluation  problems  in  multisensor,  multitarget  data  fusion: 


$$I(f_\Lambda; f_\Gamma) = \int f_\Lambda(X) \ln\left(\frac{f_\Lambda(X)}{f_\Gamma(X)}\right) \delta X$$

where  the  integral  is  a  set  integral.  This  information  metric  is  the  basis  for  a  systematic, 
information-theory  based  approach  to  multisource,  multitarget  data  fusion  system  evaluation. 
Using  such  metrics,  one  can  measure  the  overall  information  produced  by  a  data  fusion  system, 
or the specific "components" of information attributable to such special functions as target I.D.,
target localization, or target detection. In addition, a well-known challenge in performance
measurement is the fact that one end-user’s "information" is another end-user’s "confusion."
Accordingly, information-theory based approaches to performance evaluation will not prove
adequate  unless  ambiguous  or  even  subjective  user-specified  definitions  of  information  can  be 
modeled  and  computed.  Under  the  current  contract,  it  was  shown  that  this  is  indeed  possible. 
(See  section  5.3  of  Chapter  5  of  [12],  and  section  8.1  of  Chapter  8  of  [12]  for  more  details.) 
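
To make the global discrimination concrete, the Python sketch below evaluates it for a deliberately simple case: two one-track global densities that differ only in the assumed target position, with a constant detection probability and no false alarms. The set integral then has only a cardinality-0 term and a cardinality-1 term. The function names, the Gaussian model, and the integration limits are assumptions made for illustration.

```python
import math

def pdf(z, x, sigma=1.0):
    """Assumed Gaussian report density."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def one_track_density(Z, x, q):
    """Global density of the report set for a single target at x."""
    if len(Z) == 0:
        return 1.0 - q
    if len(Z) == 1:
        return q * pdf(Z[0], x)
    return 0.0

def global_kl(x_f, x_g, q, lo=-12.0, hi=12.0, n=4000):
    """Set-integral Kullback-Leibler discrimination between two one-track densities."""
    f0 = one_track_density((), x_f, q)
    g0 = one_track_density((), x_g, q)
    total = f0 * math.log(f0 / g0) if f0 > 0.0 else 0.0   # cardinality-0 term
    h = (hi - lo) / n
    for i in range(n):                                     # cardinality-1 term
        z = lo + (i + 0.5) * h
        fz = one_track_density((z,), x_f, q)
        gz = one_track_density((z,), x_g, q)
        if fz > 0.0:
            total += fz * math.log(fz / gz) * h
    return total

print(global_kl(0.0, 1.0, 0.9))
```

In this special case the result can be compared with the closed form q (x_f - x_g)^2 / (2 sigma^2), i.e. the ordinary Gaussian discrimination scaled by the detection probability.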

b.  Multisensor,  Multitarget  Decision  Theory.  The  basic  elements  of  decision  theory  can  be 
directly  generalized  to  the  multisource,  multitarget  case.  For  example,  the  Receiver  Operating 
Characteristic  (ROC)  curve  of  an  entire  multisensor,  multitarget  system  can  be  defined  as  the 
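parametrized curve $\tau \mapsto (p_{FA}(\tau), p_D(\tau))$ where

$$p_{FA}(\tau) = \int_{L(Z_1, \dots, Z_m) > \tau} f_\Sigma(Z_1, \dots, Z_m \mid H_0)\,\delta Z_1 \cdots \delta Z_m$$

$$p_D(\tau) = 1 - \int_{L(Z_1, \dots, Z_m) < \tau} f_\Sigma(Z_1, \dots, Z_m \mid H_1)\,\delta Z_1 \cdots \delta Z_m$$

and where

$$L(Z_1, \dots, Z_m) = \frac{f_\Sigma(Z_1, \dots, Z_m \mid H_1)}{f_\Sigma(Z_1, \dots, Z_m \mid H_0)}$$

is the "global" likelihood ratio for the problem of deciding between hypotheses $H_0$ and $H_1$
given global observations $Z_1, \dots, Z_m$. (See [28] for more details.)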

c.  Multisensor,  Multitarget  Estimation.  One  can  define  multisensor,  multitarget  analogs  of 
conventional statistical estimators, as well as the concept of a Bayes-optimal
multisensor, multitarget estimator. Each such estimator is actually a different possible optimal
multisensor, multitarget Level 1 data fusion algorithm.

Specifically, a global estimator of the set parameter $X$ of a global density $f_\Sigma(Z \mid X)$ is a
finite-set-valued function $\hat{X} = J(Z_1, \dots, Z_m)$ of the global measurements $Z_1, \dots, Z_m$. One can define
a multisensor, multitarget version of the maximum likelihood estimator (MLE) by

$$J_{GMLE}(Z_1, \dots, Z_m) = \arg\sup_X L(X \mid Z_1, \dots, Z_m)$$

where $L(X \mid Z_1, \dots, Z_m) = f_\Sigma(Z_1 \mid X) \cdots f_\Sigma(Z_m \mid X)$ is the "global" likelihood function. That
is, one determines that value of $t$ and those values of $\zeta_1, \dots, \zeta_t$ which maximize the value of
the likelihood function $L(\{\zeta_1, \dots, \zeta_t\} \mid Z_1, \dots, Z_m)$.

Notice that in determining $\hat{X}$, one is estimating not only the geokinematics and identities of
targets, but also their number. Thus detection, localization, and identification are unified
into a single statistical operation. This operation is a direct multisource, multitarget estimation
technique in the sense that the unknown states of the unknown number of targets are determined
directly from data, without first attempting to compute an optimal report-to-track assignment.
The  definition  of  a  multisensor,  multitarget  analog  of  the  MAP  estimator  is  less  straightforward 
but  also  possible.  See  section  5.2  of  Chapter  5  of  [12]  for  more  details. 
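
As an illustration of the global MLE, the Python sketch below performs a brute-force search over candidate state-sets drawn from a coarse grid, using the no-false-alarm, constant-detection measurement model with scalar Gaussian reports. The grid, the cap `max_targets`, the scan data, and the function names are all assumptions for the example; a practical implementation would need a far more efficient search.

```python
import itertools
import math

def pdf(z, x, sigma=1.0):
    return math.exp(-0.5 * ((z - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def global_density(Z, X, q):
    """Global density f(Z | X): no false alarms, constant detection probability q."""
    k, t = len(Z), len(X)
    if k > t:
        return 0.0
    total = sum(
        math.prod(pdf(z, X[i]) for z, i in zip(Z, idx))
        for idx in itertools.permutations(range(t), k)
    )
    return q ** k * (1.0 - q) ** (t - k) * total

def global_mle(scans, grid, q, max_targets=2):
    """Search all subsets of the grid with at most max_targets elements."""
    best_X, best_L = (), -1.0
    for t in range(max_targets + 1):
        for X in itertools.combinations(grid, t):
            L = math.prod(global_density(Z, X, q) for Z in scans)
            if L > best_L:
                best_X, best_L = X, L
    return best_X, best_L

grid = [-2.0 + 0.5 * i for i in range(21)]            # candidate positions -2.0 .. 8.0
scans = [(0.1, 4.9), (-0.1, 5.1), (0.0,)]             # third scan misses one target
X_hat, L_hat = global_mle(scans, grid, q=0.9)
print(X_hat, L_hat)
```

The search returns the number of targets and their positions together; no report-to-track assignment is computed at any point, since the likelihood already sums over all assignments.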

d.  Cramer-Rao  Inequalities.  One  can  show,  under  assumptions  analogous  to  those  used  in  the 
proof  of  the  conventional  Cramer-Rao  inequality,  that  the  following  holds  for  a  vector-valued 
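multisensor, multitarget estimator $J$:

$$\langle v, C_{J,x}(v) \rangle \cdot \langle w, L_{X,x}(w) \rangle \;\ge\; \left\langle v, \frac{\partial}{\partial_x w} E_X[\hat{X}] \right\rangle^2$$

for all $v, w$, where $\hat{X} = J(\Sigma)$ and where $C_{J,x}$ is defined by

$$\langle v, C_{J,x}(w) \rangle = E_X\big[ \langle v, \hat{X} - E_X[\hat{X}] \rangle\,\langle w, \hat{X} - E_X[\hat{X}] \rangle \big]$$

for all $v, w$, where

$$\langle v, L_{X,x}(w) \rangle = E_X\left[ \frac{\partial \ln f}{\partial_x v}\,\frac{\partial \ln f}{\partial_x w} \right], \qquad f = f_\Sigma(\cdot \mid X)$$

and where the directional derivative $\partial f / \partial_x v$ of the function $f(X)$ of a finite-set variable $X$, if
it exists, is defined by

$$\frac{\partial f}{\partial_x v}(X) = \lim_{\varepsilon \to 0} \frac{f\big((X - \{x\}) \cup \{x + \varepsilon v\}\big) - f(X)}{\varepsilon} \quad (\text{if } x \in X)$$

$$\frac{\partial f}{\partial_x v}(X) = 0 \quad (\text{if } x \notin X)$$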

The  existence  of  this  multisource,  multitarget  Cramer-Rao  inequality  opens  the  possibility  of 
determining  best-possible-theoretical  performance  bounds  for  specific  multisource,  multitarget 
data  fusion  problems.  See  section  5.3  of  Chapter  5  of  [12]  for  more  details. 

e.  Multisensor,  Multitarget  Nonlinear  Filtering.  Moving  targets  can  be  accounted  for  through 
suitable generalization of standard Bayes-Markov filtering techniques. Let $Z_1, \dots, Z_\alpha$ be a
time-sequence of global observation-sets collected by the global sensor and abbreviate the list
$Z_1, \dots, Z_\alpha$ as $Z^{(\alpha)}$. Let $f(Z_\alpha \mid X_\alpha)$ be the global density which describes the behavior of the global
sensor. Under Markov assumptions, the dynamic time-evolution of the entire multitarget system
between measurements is described by a global Markov transition density $f_{\alpha+1|\alpha}(X_{\alpha+1} \mid X_\alpha)$.

Note that, in general, the finite parameter-sets $X_{\alpha+1}$ and $X_\alpha$ may have differing numbers of
elements. This fact allows our global motion model to model variation in the number of
targets due to appearance and disappearance of targets.

Lockheed Martin has shown that it is possible to construct global transition densities
$f_{\alpha+1|\alpha}(X_{\alpha+1} \mid X_\alpha)$ from the motion models $f_{\alpha+1|\alpha}(\zeta_{\alpha+1} \mid \zeta_\alpha)$ of individual targets in such a way
that the global-target state transition is completely consistent with the state-transitions of the
individual targets. See section 6.4.3 of Chapter 6 of [12]. As a special case, assume that the
number of targets always remains constant over time. Then a typical global transition density will
have the form

$$f_{\alpha+1|\alpha}(\{\zeta'_1, \dots, \zeta'_t\} \mid \{\zeta_1, \dots, \zeta_t\}) = \sum_{\sigma} f_{\alpha+1|\alpha}(\zeta'_1 \mid \zeta_{\sigma 1}) \cdots f_{\alpha+1|\alpha}(\zeta'_t \mid \zeta_{\sigma t})$$

for any $t = 0, 1, 2, \dots$, where $f_{\alpha+1|\alpha}$ is a conventional Markov transition density and the sum is
taken over all permutations $\sigma$ on the numbers $1, \dots, t$. The global transition density formulation is general enough
in form that it can accommodate different motion models for different targets. (The reason is
that, if $\zeta_{\alpha+1} = (x_{\alpha+1}, u)$ and $\zeta_\alpha = (x_\alpha, u)$ then $f_{\alpha+1|\alpha}(\zeta_{\alpha+1} \mid \zeta_\alpha) = f_{\alpha+1|\alpha,u}(x_{\alpha+1} \mid x_\alpha)$, where
$f_{\alpha+1|\alpha,u}$ denotes a Markov transition density associated with the target or target type identified
by $u$.)
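
A short Python sketch of the constant-target-number special case follows, assuming the global transition density is the sum over permutations of products of single-target transition densities. The Gaussian random-walk motion model and the names `single_target_transition` and `global_transition` are assumptions introduced only for this example.

```python
import itertools
import math

def single_target_transition(y, x, sigma=1.0):
    """Assumed random-walk motion model: new state y given old state x."""
    return math.exp(-0.5 * ((y - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def global_transition(Y, X, f=single_target_transition):
    """Global transition density of new state-set Y given old state-set X.

    Only the special case of an unchanged number of targets is modeled, so
    the density is zero whenever the two sets differ in size.
    """
    if len(Y) != len(X):
        return 0.0
    return sum(
        math.prod(f(y, X[s]) for y, s in zip(Y, perm))
        for perm in itertools.permutations(range(len(X)))
    )

print(global_transition((0.1, 5.2), (0.0, 5.0)))
```

Summing over permutations makes the result independent of how the targets are ordered, which is what allows the state to be treated as a set rather than a list.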


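It follows that the concept of Markov time-prediction can be directly generalized to global
densities:

$$f_{\alpha+1|\alpha}(X_{\alpha+1} \mid Z_1, \dots, Z_\alpha, U^\alpha) = \int f_{\alpha+1|\alpha}(X_{\alpha+1} \mid X_\alpha, u_\alpha)\,f_{\alpha|\alpha}(X_\alpha \mid Z_1, \dots, Z_\alpha, U^{\alpha-1})\,\delta X_\alpha$$

where the integral is a set integral and where $U^\alpha$ and $u_\alpha$ are control inputs. Likewise, the
discrete-time Bayesian nonlinear filtering equation can be directly generalized:

$$f_{\alpha+1|\alpha+1}(X_{\alpha+1} \mid Z_1, \dots, Z_{\alpha+1}, U^\alpha) = \frac{f(Z_{\alpha+1} \mid X_{\alpha+1})\,f_{\alpha+1|\alpha}(X_{\alpha+1} \mid Z_1, \dots, Z_\alpha, U^\alpha)}{\int f(Z_{\alpha+1} \mid Y)\,f_{\alpha+1|\alpha}(Y \mid Z_1, \dots, Z_\alpha, U^\alpha)\,\delta Y}$$

Estimates of the time-evolving state can be computed using estimation approaches
(e.g., MAP estimation: determining which value of the set parameter maximizes the global
posterior (in a specific sense not discussed here) at any given time-instant). For more details,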
see  section  6.4  of  Chapter  6  of  [12]. 
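
The prediction and update steps can be illustrated on a toy problem in which the state space is finite, so that every set integral reduces to a finite sum. The Python sketch below assumes at most one static target on five cells, which may appear or disappear between scans, a sensor with no false alarms, and no control inputs. All constants, models, and function names are hypothetical and chosen only to keep the example small.

```python
N_CELLS = 5
P_BIRTH, P_SURVIVE, P_DETECT = 0.1, 0.95, 0.9     # assumed toy parameters
EMPTY = frozenset()
STATES = [EMPTY] + [frozenset({c}) for c in range(N_CELLS)]

def transition(new, old):
    """Toy global Markov transition density: one static target that may appear or vanish."""
    if not old:
        return 1.0 - P_BIRTH if not new else P_BIRTH / N_CELLS
    if not new:
        return 1.0 - P_SURVIVE
    return P_SURVIVE if new == old else 0.0

def likelihood(Z, X):
    """Toy global density of report-set Z given state-set X (no false alarms)."""
    if len(Z) > len(X):
        return 0.0
    if not X:
        return 1.0
    if not Z:
        return 1.0 - P_DETECT
    (z,), (c,) = tuple(Z), tuple(X)
    return P_DETECT * (0.8 if z == c else 0.2 / (N_CELLS - 1))

def predict(posterior):
    """Markov time-prediction: sum over old state-sets."""
    return {new: sum(transition(new, old) * p for old, p in posterior.items())
            for new in STATES}

def update(predicted, Z):
    """Bayes update with a new global observation Z."""
    unnorm = {X: likelihood(Z, X) * p for X, p in predicted.items()}
    total = sum(unnorm.values())
    return {X: w / total for X, w in unnorm.items()}

prior = {X: 1.0 / len(STATES) for X in STATES}
posterior = update(predict(prior), frozenset({2}))
print(max(posterior, key=posterior.get), posterior[EMPTY])
```

Because the posterior is a distribution over state-sets, it carries the probability that no target is present alongside the probabilities for each target location.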

f. Sensor Management: Multisensor Nonlinear Control Theory. Recall that sensor management
is  the  problem  of  controlling  the  re-directable  and/or  multimode  sensors  in  a  sensor  suite  in 
order  to  resolve  ambiguities  in  our  knowledge  about  multiple  targets.  The  parallelism  between 
point-variate and finite-set-variate statistics suggests that one way of attacking this problem is
by first examining the solution methodology for the single-sensor, single-target case, e.g., a
missile-tracking camera as it attempts to follow a missile. The camera must adjust its azimuth,
elevation,  and  focal  length  in  such  a  way  as  to  anticipate  the  location  of  the  missile  at  the  time 
the  next  image  of  the  missile  is  to  be  recorded.  This  is  a  standard  problem  in  optimal  control 
theory.  Such  problems  are  solved  by  defining  a  controlled  vector,  associated  with  the  camera, 
and  a  reference  vector,  associated  with  the  target,  and  attempting  to  keep  the  distance  between 
these  two  vectors  as  small  as  possible. 


An  approach  to  the  multisensor,  multitarget  sensor  management  problem  becomes  evident  if  we 
use  the  "almost-parallel  worlds  principle"  to  reformulate  such  problems  as  a  single-sensor, 
single-target problem. In this case the "global" sensor follows a "global" target (even though
some of its individual targets may not yet have been detected). The motion of the multitarget
system is modeled using a global Markov transition density. The only undetermined aspect of
the problem is how to define analogs of the controlled and reference vectors. This is done by
determining the Kullback-Leibler information distance between two suitable global densities.

In  more  detail,  suppose  that  the  global  sensor  is  "following"  the  "trajectory"  of  the  global  target, 
in  some  sense  whose  meaning  we  wish  to  determine.  Suppose  that  over  a  given  time  period  the 
global sensor collects the time-succession of global observations $Z_1, \dots, Z_\alpha$ of the
time-succession of global states $X_1, \dots, X_\alpha$ of the global target, and that the control sensors collect
a series $z^*_1, \dots, z^*_\alpha$ of global measurements of the states $x^*_1, \dots, x^*_\alpha$ of the global sensor
(i.e., the states of the individual sensors). Finally, let $u_1, \dots, u_\alpha$ be the sequence of global
control input vectors that causes the global sensor to "follow" the global target. Consider a
fixed time-instant $\alpha$. At that time-instant, our current knowledge of the state of the global
target is completely described by the current global posterior distribution:

$$f_{\alpha|\alpha}(X_\alpha, x^*_\alpha \mid Z_1, \dots, Z_\alpha, z^*_1, \dots, z^*_\alpha, u_1, \dots, u_{\alpha-1})$$

Integrating out the control state $x^*_\alpha$ yields the global marginal posterior which describes the
statistics of the state of the global target:

$$g_{\alpha|\alpha}(X_\alpha \mid Z_1, \dots, Z_\alpha, z^*_1, \dots, z^*_\alpha, u_1, \dots, u_{\alpha-1})$$

Also, note that by using the global Markov transition density for the global target alone,

$$f_{\alpha+1|\alpha}(X_{\alpha+1} \mid X_\alpha)$$

this information can be extrapolated to time-instant $\alpha+1$:

$$g_{\alpha+1|\alpha}(X_{\alpha+1}) = g_{\alpha+1|\alpha}(X_{\alpha+1} \mid Z_1, \dots, Z_\alpha, z^*_1, \dots, z^*_\alpha, u_1, \dots, u_{\alpha-1})$$

This global density comprehensively describes the state of our information concerning the global
target at time-instant $\alpha+1$, conditioned on the past measurement- and control-history up to
time-instant $\alpha$. On the other hand, the global Markov transition density for the entire system

$$f_{\alpha+1|\alpha}(X_{\alpha+1}, x^*_{\alpha+1} \mid X_\alpha, x^*_\alpha, u_\alpha)$$

can be used to time-update the global posterior at time-instant $\alpha$ to time-instant $\alpha+1$:

$$f_{\alpha+1|\alpha}(X_{\alpha+1}, x^*_{\alpha+1} \mid Z_1, \dots, Z_\alpha, z^*_1, \dots, z^*_\alpha, u_1, \dots, u_\alpha)$$

which, again computing the marginal distribution, leads to:

$$g_{\alpha+1|\alpha}(X_{\alpha+1} \mid u_\alpha) = \int f_{\alpha+1|\alpha}(X_{\alpha+1}, x^*_{\alpha+1} \mid Z_1, \dots, Z_\alpha, z^*_1, \dots, z^*_\alpha, u_1, \dots, u_\alpha)\,dx^*_{\alpha+1}$$

Finally, suppose that an additional observation $Z_{\alpha+1}$ of the global target is collected. Then we
can compute the updated global marginal posterior

$$g_{\alpha+1|\alpha+1}(X_{\alpha+1} \mid Z_{\alpha+1}, u_\alpha) = g_{\alpha+1|\alpha+1}(X_{\alpha+1} \mid Z_1, \dots, Z_{\alpha+1}, z^*_1, \dots, z^*_\alpha, u_1, \dots, u_\alpha)$$

This  global  density  comprehensively  describes  the  state  of  our  information  concerning  the  global 
target at time-instant $\alpha+1$:

1)  conditioned on the past measurement- and control-history up to time-instant $\alpha$;

2)  conditioned on the control variable $u_\alpha$ (whose value has yet to be determined); and

3)  based  on  the  new  global  measurement  of  the  state  of  the  global  target. 

If additional information about the state of the global target has become available at time-instant
$\alpha+1$, therefore, it must be the result of two things:

•  the additional information contained in the global observation $Z_{\alpha+1}$, and

•  the additional information resulting from a "good" choice of the global control vector $u_\alpha$.

We are interested only in the latter, since we wish to determine a value of the global control
vector $u_\alpha$ which will have the global sensor "pointing" at the global target as optimally as
possible, even before we collect a new observation. If we set $Z_{\alpha+1} = \emptyset$ (nothing at all is
observed at time-instant $\alpha+1$) then the global density

$$g_{\alpha+1|\alpha+1}(X_{\alpha+1} \mid \emptyset, u_\alpha)$$

is a measure of the information at time-instant $\alpha+1$ that is due to "good" sensor allocation via
the control-action alone, independently of any further information that might result from
additional measurements. Accordingly, the global control $u_\alpha$ has been chosen "optimally" if
the information contained in

$$g_{\alpha+1|\alpha+1}(X_{\alpha+1} \mid \emptyset, u_\alpha)$$

due to the latest control, is as large as possible compared to the information contained in

$$g_{\alpha+1|\alpha}(X_{\alpha+1})$$

which is due to past history up to time $\alpha$ alone. Or, stated in other terms: The increase in
entropy (i.e., decrease in information) due to the control should be as small as possible. From
this point of view, the density

$$g_{\alpha+1|\alpha}(X_{\alpha+1})$$

is an analog of a target "reference variable" and the density

$$g_{\alpha+1|\alpha+1}(X_{\alpha+1} \mid \emptyset, u_\alpha)$$

is an analog of a sensor "controlled variable," with entropy being the measure of "closeness"
between the two. Accordingly, we need a metric which measures increases in information (as
represented in the form of global densities). The global Kullback-Leibler information metric has
already been defined as

$$I(f; g) = \int f(X) \ln\left(\frac{f(X)}{g(X)}\right) \delta X$$

Figure 1: The statement "target 1 is NEAR location A and target 2 is NEAR
location B" is illustrated when the concept NEAR is interpreted as constraining
observations within "cookie cutter" regions $\Theta_1$ and $\Theta_2$ of locations A, B.

Figure 2: The statement "target 1 is NEAR location A and target 2 is NEAR
location B" illustrated when the concept NEAR is interpreted as a variable
constraint by random subsets $\Theta_1$, $\Theta_2$.


Based on the preceding considerations, we define the following optimality criterion for
multisensor, multitarget optimal target-tracking control: Let $f_{\alpha+1|\alpha}(X_{\alpha+1})$ and
$f_{\alpha+1|\alpha+1}(X_{\alpha+1} \mid \emptyset, u_\alpha)$ be as defined previously. Then we define the following cost function:


»•••>  tij^)  — 


E  +  u/Pu„ 

a  =0 


We say that an "optimal target-tracker control law" for the multisensor, multitarget allocation
problem is a sequence of control inputs which minimizes this cost function. (For
more details see [32].)
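
As a small illustration of the Kullback-Leibler quantity that drives this criterion, the sketch below evaluates $K(f;g)$ for ordinary discrete distributions rather than for global densities and set integrals. The cell names and probabilities are hypothetical, and the function is a simplified stand-in, not the report's multitarget computation.

```python
import math

def kl_discrimination(f, g):
    """K(f; g) = sum_x f(x) * ln(f(x) / g(x)) for discrete distributions stored as dicts."""
    total = 0.0
    for x, fx in f.items():
        if fx == 0.0:
            continue                      # a term 0 * ln(0 / g) is taken to be 0
        gx = g.get(x, 0.0)
        if gx == 0.0:
            return math.inf               # f puts mass where g has none
        total += fx * math.log(fx / gx)
    return total

# Hypothetical example: a flat predicted density and a sharper one over four cells.
flat = {"cell_1": 0.25, "cell_2": 0.25, "cell_3": 0.25, "cell_4": 0.25}
sharp = {"cell_1": 0.70, "cell_2": 0.10, "cell_3": 0.10, "cell_4": 0.10}

gain = kl_discrimination(sharp, flat)     # positive: sharp is more informative than flat
```

The quantity is zero when the two distributions coincide and grows as the first becomes more concentrated relative to the second.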


g.  Data  Fusion  Using  Ambiguous  Evidence.  In  section  2.3  of  Chapter  2  of  [12]  it  is  shown  that 
many  kinds  of  ambiguous  evidence  can  be  modeled  as  random  sets.  It  is  possible  to  show,  as 
is  done  in  Chapter  7  of  [12],  that  the  two  sides  of  data  fusion— multisensor,  multitarget 
estimation  on  the  one  hand  and  expert-systems  theory  on  the  other— can  be  fully  integrated. 

Specifically,  one  begins  by  modeling  ambiguous  observations  as  discrete  closed  random  subsets 
of  observation  space  and  then  specifying  measurement  models  for  ambiguous  evidence.  Suppose 
that  we  have  a  statement  such  as 


"target  1  is  NEAR  location  A  and  target  2  is  NEAR  location  B” 

Suppose that NEAR in these two cases can be interpreted to mean that target 1 (and therefore
observations of target 1) will always be found in "cookie cutter" region $G_1$ and, likewise, that
observations of target 2 will always be found in "cookie cutter" region $G_2$. Then Figure 1
illustrates the constraint that is imposed by this evidence. In general, however, evidence will
consist not of a simple cookie-cutter constraint $G_1$ but rather of a range of constraints
$G_1^{(1)} \subseteq \cdots \subseteq G_1^{(d)}$, each being an interpretation of the concept NEAR to some degrees
$r_1, \ldots, r_d$ of likelihood. This evidence can be modeled as a random subset $\Theta_1$ of observation
space such that $p(\Theta_1 = G_1^{(i)}) = r_i$ for all $i = 1, \ldots, d$, as illustrated in Figure 2.

Measurement models for ambiguous observations consist of global measurement densities which
have the form

$$f(Z_1, \ldots, Z_m, \Theta_1, \ldots, \Theta_{m'} \mid X)$$

where $Z_1, \ldots, Z_m$ denote precise global observations, and where $\Theta_1, \ldots, \Theta_{m'}$ are discrete random
closed subsets of observation space which model ambiguous evidence. From these measurement
models one can then define global posterior densities conditioned on both data and evidence:

$$f(X \mid Z_1, \ldots, Z_m, \Theta_1, \ldots, \Theta_{m'})$$


Then one can derive recursive update equations for both data and evidence. For example,
applying one particular measurement model and suitable independence assumptions, we get

$$f(X \mid Z_1, \ldots, Z_m, \Theta_1, \ldots, \Theta_{m'}) \propto f(Z_m \mid X)\, f(X \mid Z_1, \ldots, Z_{m-1}, \Theta_1, \ldots, \Theta_{m'})$$

$$f(X \mid Z_1, \ldots, Z_m, \Theta_1, \ldots, \Theta_{m'}) \propto f(\Theta_{m'} \mid X)\, f(X \mid Z_1, \ldots, Z_m, \Theta_1, \ldots, \Theta_{m'-1})$$

where $f(\Theta \mid X)$ describes the influence of evidence $\Theta$. It thereby becomes possible to extend
the Bayesian nonlinear filtering equations so that both precise and ambiguous observations can
be accommodated into dynamic multisensor, multitarget estimation. For example, if one assumes
one possible measurement model for evidence, one gets the following update equation:


•'a+lla+l^  a+1  I  ^  ^  i  /  \  /  \ 

/  m.,  I  y..,)  I .  (y..,  I  Z'“> .  e<“>)  67.., 


h. A Bayesian Characterization of Rules of Evidential Combination. If one specific
measurement model for ambiguous evidence is assumed (the so-called "data-dependent" model;
see Chapter 7 of [12]) then posterior distributions will have the property that

$$f(X \mid Z_1, \ldots, Z_m, \Theta_1, \ldots, \Theta_{m'}) = f(X \mid Z_1, \ldots, Z_m, \Theta_1 \cap \cdots \cap \Theta_{m'})$$

In other words, the random-set intersection operator '$\cap$' may be interpreted as a means of
fusing multiple pieces of ambiguous evidence in such a way that posteriors conditioned on the
fused evidence are identical to posteriors conditioned on the individual evidence and computed
using Bayes' rule alone. Thus, for example, suppose that

$$\Theta_1 = \Sigma_A(f) = \{u \in U \mid A \le f(u)\}$$

$$\Theta_2 = \Sigma_A(g) = \{u \in U \mid A \le g(u)\}$$

where $f, g$ are two fuzzy membership functions on $U$ and $A$ is a uniformly distributed
random number on [0,1]. Let $\wedge$ denote the Zadeh "min" fuzzy AND operation and define the
posteriors $f(X \mid Z, f, g)$ and $f(X \mid Z, f \wedge g)$ by

$$f(X \mid Z, f, g) = f(X \mid Z, \Sigma_A(f), \Sigma_A(g))$$

$$f(X \mid Z, f \wedge g) = f(X \mid Z, \Sigma_A(f \wedge g))$$

We know from section 2.3.2 of Chapter 2 of [12] that

$$\Sigma_A(f) \cap \Sigma_A(g) = \Sigma_A(f \wedge g)$$

and thus that

$$f(X \mid Z, f, g) = f(X \mid Z, \Sigma_A(f), \Sigma_A(g)) = f(X \mid Z, \Sigma_A(f) \cap \Sigma_A(g))$$
$$= f(X \mid Z, \Sigma_A(f \wedge g)) = f(X \mid Z, f \wedge g)$$

and so

$$f(X \mid Z, f, g) = f(X \mid Z, f \wedge g)$$

That  is:  The  fuzzy  AND  is  a  means  of  fusing  fuzzy  evidence  in  such  a  way  that  posteriors 
conditioned  on  the  fused  evidence  are  identical  to  posteriors  conditioned  on  the  fuzzy  evidence 
individually and computed using Bayes' rule alone. Thus fuzzy logic is entirely consistent with
Bayesian probability, provided that it is first represented in random set form, and provided that
we use a specific measurement model for ambiguous observations. Similar observations apply
to  any  rule  of  evidential  combination  which  bears  a  homomorphic  relationship  with  the  random 
set  intersection  operator. 
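
The homomorphism between the Zadeh "min" AND and random-set intersection can be checked numerically. The sketch below is an illustration only: the universe `u1..u4` and the membership values are hypothetical, and the random number $A$ is drawn with Python's standard generator.

```python
import random

def level_set(membership, a):
    """Sigma_a(f) = {u : a <= f(u)} for a fuzzy membership function stored as a dict."""
    return {u for u, value in membership.items() if a <= value}

def fuzzy_and(f, g):
    """Zadeh 'min' conjunction of two membership functions on the same universe."""
    return {u: min(f[u], g[u]) for u in f}

# Hypothetical membership functions on a four-element universe.
f = {"u1": 0.9, "u2": 0.6, "u3": 0.2, "u4": 0.0}
g = {"u1": 0.4, "u2": 0.8, "u3": 0.5, "u4": 0.1}

rng = random.Random(0)
samples = [rng.random() for _ in range(1000)]   # draws of the uniform number A

# For every draw, intersecting the two random sets gives the random set of min(f, g).
all_equal = all(
    level_set(f, a) & level_set(g, a) == level_set(fuzzy_and(f, g), a)
    for a in samples
)
```

The identity holds for every draw because $a \le f(u)$ and $a \le g(u)$ together are equivalent to $a \le \min(f(u), g(u))$.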

4-B-g.  Possible  Objections  to  "Finite-Set  Statistics".  A  number  of  objections  to  the  use  of 
random  set  theory  in  data  fusion  can  be  anticipated.  We  address  some  of  these  in  turn. 

Isn’t  It  Just  "Bookkeeping”?  The  reformulation  of  multisource,  multitarget  problems  as 
single-sensor,  single-target  problems  is  not  just  a  mathematical  "bookkeeping"  device.  Generally 
speaking,  any  group  of  targets  observed  by  imperfect  sensors  must  be  analyzed  as  a  single 
indivisible  entity  rather  than  as  a  collection  of  unrelated  individuals.  When  measurement 
uncertainties  are  large  in  comparison  to  target  separations  there  will  always  be  a  significant 
likelihood  that  any  given  measurement  was  generated  by  any  given  target.  This  means  that  every 
measurement  can  be  associated,  partially  or  in  some  degree  of  proportion,  to  every  target.  The 
more  irresolvable  the  targets  are,  the  more  our  estimates  of  them  will  be  statistically  correlated 
and  thus  the  more  that  they  will  seem  as  though  they  are  a  single  target.  Observations  can  no 
longer  be  regarded  as  separate  entities  generated  by  individual  targets  but  rather  as  collective 
phenomena  generated  by  the  entire  multitarget  system.  This  remains  true  even  when  target 
separations  are  large  in  comparison  to  sensor  uncertainties.  Though  in  this  case  the  likelihood 
is  very  small  that  a  given  observation  was  generated  by  any  other  target  than  the  one  it  is 
intuitively  associated  with,  nevertheless  this  likelihood  is  nonvanishing.  Thus  the  intuitive  "this 
observation  goes  with  that  track"  perspective  is  only  an  ideal  limiting  case  of  the  more  general 
multitarget problem. The random set approach inherently models (and forces one to take
account of) the organic/collective nature of multitarget systems.

Why Not Vectors or Point Processes? Other skeptics might ask: Why not simply use vector
models or point process models? It could be objected, for example, that what we have called
the "global density" $f_\Sigma(Z)$ of a finite random subset $\Sigma$ is just a new name and notation for the
Janossy densities $j_n(z_1, \ldots, z_n)$, $n \ge 0$ ([5], pp. 122-123) of the corresponding simple finite point
process defined by $N_\Sigma(S) = |\Sigma \cap S|$ for all measurable $S$. (Point processes have been
investigated as a basis for multitarget tracking in [59, 65], for example.) We offer the
following responses to such objections.

First,  vector  approaches  encourage  carelessness  in  regard  to  basic  questions.  For  example,  to 
apply  the  theorems  of  conventional  estimation  theory  one  must  clearly  identify  a  measurement 
space  and  a  state  space  and  specify  their  topological  and  metrical  properties.  Wald’s  proof  of 
the consistency of the maximum likelihood and maximum a posteriori estimators, for example,
assumes  that  state  space  is  a  metric  space  satisfying  certain  properties.  As  another  instance 
suppose  that  we  want  to  determine  whether  small  deviations  in  the  input  data  to  a  data  fusion 
algorithm  can  result  in  large  deviations  in  output  data.  To  answer  this  question  one  must  first 
have  some  idea  of  what  distance  means  in  both  measurement  space  and  state  space.  The  standard 
Euclidean metric is clearly not adequate: If we represent an observation set $\{z_1, z_2\}$ as a vector
$(z_1, z_2)$ then $\|(z_1, z_2) - (z_2, z_1)\| \ne 0$ even though the order of measurements should not
matter. Likewise one might ask: What is the distance between $(z_1, z_2)$ and $z_1$? Whereas both
finite set theory and point process theory have rigorous metrical concepts, attempts to define
metrics for vector models can quickly degenerate into ad hoc invention.
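
The ordering problem can be made concrete with a short sketch. The Hausdorff distance is used here only as one example of a set-based metric; the passage above does not name a specific metric, so that choice and the sample points are assumptions made for illustration.

```python
import math

def stacked_distance(zs, ws):
    """Euclidean distance between two measurement lists concatenated into vectors."""
    flat_z = [c for z in zs for c in z]
    flat_w = [c for w in ws for c in w]
    return math.dist(flat_z, flat_w)

def hausdorff(zs, ws):
    """Hausdorff distance between two nonempty finite point sets."""
    d_zw = max(min(math.dist(z, w) for w in ws) for z in zs)
    d_wz = max(min(math.dist(z, w) for z in zs) for w in ws)
    return max(d_zw, d_wz)

z1, z2 = (0.0, 0.0), (3.0, 4.0)

vector_gap = stacked_distance([z1, z2], [z2, z1])   # nonzero: order matters for vectors
set_gap = hausdorff([z1, z2], [z2, z1])             # zero: same set, different order
mixed_gap = hausdorff([z1, z2], [z1])               # defined even for unequal cardinalities
```

The set-based distance ignores ordering and remains defined when the two observation sets have different numbers of elements, which the stacked-vector distance cannot offer.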

More  generally,  the  use  of  vector  models  has  resulted  in  piecemeal  solutions  to  information 
fusion  problems  (most  typically,  the  assumption  that  the  number  of  targets  is  known  a  priori). 
Lastly,  any  attempt  to  incorporate  expert  systems  theory  into  the  vector  approach  results  in 
extremely  awkward  attempts  to  make  vectors  behave  as  though  they  were  finite  sets. 

Second,  the  random  set  approach  is  explicitly  geometric  in  that  the  random  variates  in  question 
are  actual  sets  of  observations— rather  than,  say,  abstract  integer-valued  measures. 

Third,  systematic  adherence  to  a  random  set  perspective  results  in  a  series  of  direct  parallels 
between  single-sensor,  single-target  statistics  and  multisensor,  multitarget  statistics  which  results 
in  a  methodology  for  information  fusion  that  is  nearly  identical  in  general  behavior  to  the 
"Statistics  101"  formalism  with  which  engineering  practitioners  and  theorists  are  already 
familiar.  (The  elements  of  the  random  set  approach  are  simple  enough  that  they  could  be  taught 
at  the  junior  and  senior  undergraduate  levels.  Random  measure  theory,  by  way  of  contrast,  is 
unlikely  to  ever  find  its  way  into  the  undergraduate  curriculum  even  in  mathematics 
departments.)  More  importantly,  it  leads  to  a  systematic  approach  to  solving  information  fusion 
problems  that  allows  standard  single-sensor,  single-target  statistical  techniques  to  be  directly 
generalized  to  the  multisensor,  multitarget  case. 

Fourth,  because  the  random  set  approach  provides  a  systematic  foundation  for  both  expert 
systems  theory  and  multisensor,  multitarget  estimation  (see  section  2.5.6  of  Chapter  2  of  [12]), 
it permits a systematic and mathematically rigorous integration of these two quite different
aspects of information fusion, a question left unaddressed by either the vector or point-process
models.

Fifth, an analogous situation holds in the case of random subsets of $\mathbb{R}^n$ which are convex and
bounded. Given a bounded convex subset $T \subseteq \mathbb{R}^n$ the support function of $T$ is defined by

$$s_T(e) = \sup_{x \in T} \langle e, x \rangle$$

for all vectors $e$ on the unit hypersphere in $\mathbb{R}^n$, where $\langle \cdot, \cdot \rangle$ denotes the inner product on $\mathbb{R}^n$.
The assignment $\Sigma \mapsto s_\Sigma$ establishes a very faithful embedding of random bounded convex sets
$\Sigma$ into the random functions on the unit hypersphere, in the sense that it encodes the behavior
of bounded convex sets into vector mathematics [27, 46]. Nevertheless, it does not follow that
the theory of random bounded convex subsets is a special case of random function theory.
Rather, random functions provide a useful tool for studying the behavior of random bounded
convex sets. In like manner, finite point processes are best understood as specific (and by no
means the only or the most useful) representations of random finite subsets as elements of some
abstract vector space.

4-B-h.  Computational  techniques  analysis.  A  survey  of  potentially  applicable  computational 
and  approximation  techniques  was  accomplished  under  the  current  contract.  The  techniques  thus 
far  identified  as  suitable  for  further  investigation  are  as  follows. 

Approximate Computation of Permanents of Matrices. The most numerically intensive
computation involved in random set formulations of multitarget data fusion problems consists
of evaluating or approximating combinatorial sums of the form

$$\sum_{A \subseteq Q} \mathrm{perm}(A)$$

where $Q = (Q_{i,j})$ is an arbitrary real-valued matrix, where the summation is taken
over all square submatrices $A = (A_{i,j})$ of $Q$, and where

$$\mathrm{perm}(A) = \sum_{\sigma} A_{1,\sigma 1} \cdots A_{e,\sigma e}$$

is the permanent [2] of $A$ (summation taken over all permutations $\sigma$ on the numbers $1, \ldots, e$).
The reason for this is as follows. Let $z_1, \ldots, z_n$ be a scan of observations, let $x_1, \ldots, x_m$ be a
list of target parameters with $m < n$, and define

$$Q_{i,j} = f(z_j \mid x_i)$$

with $i = 1, \ldots, m$; $j = 1, \ldots, n$. Then, assuming independence of measurements, exact
computation of global densities requires computation of all permanents of all square submatrices
of matrices of the form Q.
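
A brute-force sketch of this combinatorial sum is given below. It is meant only to make the definition concrete and to show why the cost grows so quickly; the matrix values are hypothetical, and excluding the empty submatrix is an assumption of this sketch.

```python
from itertools import combinations, permutations
from math import prod

def permanent(a):
    """Permanent of a square matrix (list of rows) by direct summation over permutations."""
    e = len(a)
    return sum(prod(a[i][s[i]] for i in range(e)) for s in permutations(range(e)))

def sum_submatrix_permanents(q):
    """Sum of perm(A) over all nonempty square submatrices A of an m x n matrix q."""
    m, n = len(q), len(q[0])
    total = 0.0
    for size in range(1, min(m, n) + 1):
        for rows in combinations(range(m), size):
            for cols in combinations(range(n), size):
                sub = [[q[i][j] for j in cols] for i in rows]
                total += permanent(sub)
    return total

# Hypothetical 2 x 3 matrix of likelihood values Q[i][j].
q = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

total = sum_submatrix_permanents(q)
```

Both the number of submatrices and the cost of each permanent grow combinatorially, which is why approximation methods are of interest.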

Permanents can be computed exactly with computational complexity of order $2^n$ [50]. Permanents can
also be approximated in a far more computationally efficient manner, however, because
they share some of the same properties as determinants. For example, if $Q$ can be
written as a block-diagonal matrix whose blocks are the square submatrices $Q_1, \ldots, Q_e$ then
$\mathrm{perm}(Q) = \mathrm{perm}(Q_1) \cdots \mathrm{perm}(Q_e)$. Also, permanents can be expanded by rows or by columns,
and are unchanged by interchanges of rows or interchanges of columns. Approximate
computation of permanents is achieved by "sparsifying" confusion matrices, i.e., by zeroing out
any entries which are excessively small compared to other entries (that is, any low-likelihood
associations are ignored). A number of other methods for approximating the values of
permanents have been investigated [23, 54].
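
The sparsification idea can be sketched as follows. The thresholding rule (drop entries below a fixed fraction of the largest entry in the same row) and the matrix values are assumptions chosen for illustration; the text does not specify a particular rule.

```python
from itertools import permutations
from math import prod

def permanent(a):
    """Permanent of a square matrix by direct summation over permutations."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))

def sparsify(q, rel_threshold):
    """Zero every entry smaller than rel_threshold times the largest entry in its row."""
    out = []
    for row in q:
        cutoff = rel_threshold * max(row)
        out.append([v if v >= cutoff else 0.0 for v in row])
    return out

# Hypothetical confusion matrix: two well-separated pairs of targets.
q = [[0.90, 0.80, 1e-6, 1e-7],
     [0.70, 0.95, 1e-7, 1e-6],
     [1e-6, 1e-7, 0.85, 0.60],
     [1e-7, 1e-6, 0.50, 0.90]]

exact = permanent(q)
sparse = sparsify(q, 1e-3)                      # becomes block diagonal
block_1 = [row[:2] for row in sparse[:2]]
block_2 = [row[2:] for row in sparse[2:]]
approx = permanent(block_1) * permanent(block_2)
```

Once the small entries are removed the matrix is block diagonal, so the permanent factors into the product of two 2 x 2 permanents with negligible loss of accuracy in this example.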

Approximation by Gaussian Sums. The Gaussian sum approach for approximating time-propagated
general Bayesian posterior distributions was introduced over twenty years ago by
Alspach and Sorenson [62, 63]. The basic idea is to approximate the various densities involved
in Bayesian recursive nonlinear filtering (the prior, state-transition, and measurement-model
densities) as Gaussian mixtures (finite sums of Gaussian distributions). Gaussian sums are
closed under the recursive filtering operations (integration and time prediction; division and
Bayesian update). As a result, the time-evolved posterior densities are also Gaussian sums,
though the number of terms in these sums increases exponentially with time and thus pruning
of terms is necessary.

If  the  underlying  sensors  and  target  motion  models  are  Gaussian,  it  is  easy  to  demonstrate  that 
the  corresponding  multitarget  posterior  distributions  (global  posteriors)  are  themselves  Gaussian 
sums.  Accordingly,  in  the  purely  Gaussian  case  the  multisensor,  multitarget  problem  is  naturally 
suited  for  application  of  Gaussian  sum  approximation  techniques.  More  generally,  suppose  that 
the  prior,  measurement-model,  and  motion-model  densities  for  individual  sensors  and  targets, 
respectively,  are  approximated  as  Gaussian  sums.  Then  the  corresponding  global  densities  are 
also  approximated  as  Gaussian  sums.  Since  the  global  recursive  nonlinear  filtering  equations 
(discussed  earlier)  are  closed  with  respect  to  Gaussian  sums,  it  follows  that  approximation 
techniques  of  this  kind  may  prove  useful  in  more  general  data  fusion  problems  as  well. 
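
The closure property and the growth in the number of terms can be seen in a one-dimensional, single-target sketch. The measurement model $z = x + v$ with Gaussian-sum noise, the numerical values, and the keep-the-heaviest pruning rule are all assumptions made for illustration; this is not the global (multitarget) filter discussed above.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(mixture, x):
    """Density of a Gaussian sum given as a list of (weight, mean, variance)."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in mixture)

def gaussian_sum_update(prior, noise, z):
    """Bayes update of a 1-D Gaussian-sum prior for z = x + v, with Gaussian-sum noise v."""
    terms = []
    for w, m, p in prior:
        for v, b, r in noise:
            s = p + r                                  # innovation variance
            k = p / s                                  # scalar Kalman gain
            evidence = normal_pdf(z, m + b, s)         # how well this pair explains z
            terms.append((w * v * evidence, m + k * (z - m - b), (1.0 - k) * p))
    total = sum(t[0] for t in terms)
    return [(wt / total, mean, var) for wt, mean, var in terms]

def prune(mixture, keep):
    """Keep the `keep` heaviest terms and renormalise the weights."""
    kept = sorted(mixture, key=lambda t: t[0], reverse=True)[:keep]
    total = sum(t[0] for t in kept)
    return [(w / total, m, v) for w, m, v in kept]

prior = [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)]            # two-term prior
noise = [(0.8, 0.0, 0.5), (0.2, 0.0, 4.0)]             # two-term measurement noise
posterior = gaussian_sum_update(prior, noise, z=1.5)   # four terms after one update
pruned = prune(posterior, keep=2)
```

Each update multiplies the number of terms by the number of noise components, so some form of pruning or merging is needed after a few steps.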

Asymptotic Approximations of Integrals. The simplest of asymptotic approximations is the
saddle-point approximation (also known as Laplace's method of integration). It is a technique
for asymptotically approximating integrals of suitably well-behaved functions ([55], pp. 88-90
and [20], pp. 30-37). The simplest and most widely known application of the method is to
deriving Stirling's approximation formula for factorials. That is, from the general formula

$$z! = \int_0^\infty t^z e^{-t}\,dt = \int_0^\infty e^{z \ln t - t}\,dt$$

for the factorial function, one notes that, as a function of $t$, the integrand is largest when

$$\frac{\partial}{\partial t}(z \ln t - t) = 0$$

i.e. at the saddle point $t = z$. Expanding $z \ln t - t$ as a function of $t$ in a Taylor's series
about $t = z$ leads to

$$z \ln t - t = (z \ln z - z) + 0 \cdot (t - z) - \frac{1}{2z}(t - z)^2 + \cdots$$

and thus

$$z! \approx e^{z \ln z - z} \int_{-\infty}^{\infty} e^{-\frac{1}{2z}(t - z)^2}\,dt = e^{-z} z^z \sqrt{2\pi z}$$

since the saddle point value will dominate the integral for large $z$.
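
A quick numerical check of this approximation is sketched below, comparing it with the gamma function from Python's standard library. The sample values of $z$ are arbitrary.

```python
import math

def stirling(z):
    """Saddle-point approximation z! ~ exp(-z) * z**z * sqrt(2*pi*z)."""
    z = float(z)
    return math.exp(-z) * z ** z * math.sqrt(2.0 * math.pi * z)

def relative_error(z):
    exact = math.gamma(z + 1.0)          # gamma(z + 1) equals z!
    return abs(stirling(z) - exact) / exact

errors = {z: relative_error(z) for z in (1, 5, 10, 50, 100)}
```

The relative error shrinks as $z$ grows, consistent with the saddle point dominating the integral for large $z$.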

Laplace's method has been employed in statistics since the 1960s to approximate the values of
various statistical quantities defined in terms of definite integrals (see [6, 58]), assuming a large
number of data samples. It would be desirable to determine whether or not Laplace's method
can be generalized to approximate quantities defined in terms of set integrals. A broader class
of asymptotic approximation methods known as large-deviation techniques (see [3, 7]) is also
of interest.

Computational  Statistical  Mechanics.  Set  integrals  occur  regularly  in  the  statistical  theory  of 
gases and liquids (see [19], p. 234, Eq. 37.4; p. 266, Eq. 40.28) though they are never
explicitly  identified  as  such  in  that  context.  This  should  not  be  surprising  since  there  are  many 
formal  mathematical  similarities  between  multitarget  systems  with  multiple  target  types,  on  the 
one  hand,  and  ensembles  of  molecules  belonging  to  multiple  molecular  species,  on  the  other. 
Because  of  this,  there  is  reason  to  believe  that  the  physics  community  has  developed  approximate 
computational  techniques  which  might  be  applicable  to  multisource,  multitarget  information 
fusion  problems  (see,  for  example  [16]). 

Lockheed Martin has already had some success [24] with one such statistical approximation
technique known in the physics community as "mean-field approximation" [52]. This approach
provides a means of approximating combinatorial sums by first approximating them as integrals
and then applying Laplace's integration method. To illustrate the approximation of a relatively
simple combinatorial sum using this technique, we apply it to the problem of computing the
permanent of an $n \times n$ matrix $Q = (Q_{j,k})_{j,k=1,\ldots,n}$:

$$\mathrm{perm}(Q) = \sum_{\sigma} Q_{1,\sigma 1} \cdots Q_{n,\sigma n}$$

where the sum is taken over all permutations $\sigma$ on the numbers $1, \ldots, n$. Let $\Pi(n)$ denote the
set of all $n \times n$ permutation matrices, i.e. the set of all $n \times n$ matrices $A = (a_{j,k})_{j,k=1,\ldots,n}$
where $a_{j,k}$ are nonnegative integers such that $\sum_{j=1}^{n} a_{j,k} = 1$ for all $k = 1, \ldots, n$ and
$\sum_{k=1}^{n} a_{j,k} = 1$ for all $j = 1, \ldots, n$. (In other words, each row and each column of $A$ contains
exactly one nonzero entry, which must in turn be 1.) Then we can write


$$\mathrm{perm}(Q) = \sum_{A \in \Pi(n)} \prod_{j,k=1}^{n} Q_{j,k}^{a_{j,k}}$$

Now, let

$$D(A) = \prod_{k=1}^{n} \delta_0\left(\sum_{j=1}^{n} a_{j,k} - 1\right)$$

where $\delta_0(x)$ is the Kronecker delta function defined by $\delta_0(x) = 1$ if $x = 0$ and $\delta_0(x) = 0$
otherwise. Then $D(A) = 0, 1$ and $D(A) = 1$ if and only if $\sum_{j=1}^{n} a_{j,k} = 1$ for all $k = 1, \ldots, n$.

Consequently, we can write

$$\mathrm{perm}(Q) = \sum_{A \in \Pi^*(n)} D(A) \prod_{j,k=1}^{n} Q_{j,k}^{a_{j,k}}$$

where $\Pi^*(n)$ denotes the set of all $n \times n$ matrices $A$ such that $a_{j,k}$ are nonnegative integers
and such that $\sum_{k=1}^{n} a_{j,k} = 1$ for all $j = 1, \ldots, n$.
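
The relaxation from permutation matrices to the larger set of matrices with unit row sums can be verified by brute force for small $n$. In the sketch below each such matrix is represented by the column index chosen in every row, and an indicator plays the role of $D(A)$; the test matrix and function names are hypothetical.

```python
from itertools import permutations, product
from math import prod

def permanent(q):
    """Permanent by direct summation over permutations."""
    n = len(q)
    return sum(prod(q[j][s[j]] for j in range(n)) for s in permutations(range(n)))

def permanent_via_row_choices(q):
    """Sum over all 0/1 matrices with unit row sums, weighted by the column-sum indicator."""
    n = len(q)
    total = 0
    for choice in product(range(n), repeat=n):        # choice[j] = column holding the 1 in row j
        column_sums = [choice.count(k) for k in range(n)]
        d = 1 if all(c == 1 for c in column_sums) else 0   # indicator that every column sum is 1
        total += d * prod(q[j][choice[j]] for j in range(n))
    return total

q = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
```

The enlarged index set has $n^n$ elements instead of $n!$, but its members are indexed by independent row choices, which is what allows the sum to be rewritten in non-combinatorial form.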

Next, approximate $\delta_0$ as a limit of exponential functions,

6o(x)  =  lim  =  lim  f”  ,  dy  =  ^  dy 

./ItTn 

and  substitute  the  integral  expression  into  D(A).  Then  after  a  little  algebra  we  get: 

oo  oo 

perm(Q)  =  lim  f  ""  f  ® 

-»  Aen*(n) 


Now, let us be given any family of numbers $m(A) = \sum_{j,k} a_{j,k} b_{j,k}$ indexed by the matrices
$A \in \Pi^*(n)$, where $B = (b_{j,k})_{j,k=1,\ldots,n}$ is any fixed real-valued $n \times n$ matrix. Then note that

$$m(A) = (a_{1,1} b_{1,1} + \cdots + a_{1,n} b_{1,n}) + (a_{2,1} b_{2,1} + \cdots + a_{2,n} b_{2,n}) + \cdots + (a_{n,1} b_{n,1} + \cdots + a_{n,n} b_{n,n})$$

where $a_{j,1} + \cdots + a_{j,n} = 1$ for every $j = 1, \ldots, n$, is identical to the family

$$b_{1,k_1} + \cdots + b_{n,k_n}$$

indexed by the integers $k_1, \ldots, k_n = 1, \ldots, n$. Therefore we can write:


perm(Q)  =  lim  f  f  e  dy^—^ly 

N->«>  *’  k  k  -  \ 

—  oo  —oo  »***>^n  •*• 


(In other words, the combinatorial sum has been replaced by a non-combinatorial sum.) More
algebra and a substitution of variables leads to the following complex line integral:

The complex line integrals can then be approximated using the saddle point method described
earlier.  The  mean-field  approximation  has  the  effect  of  reducing  the  computational  complexity 
of  perm(Q),  which  is  of  order  2",  to  an  approximation  of  order  n\  For  more  details  see  [24], 

4-C.  LIST  OF  PUBLICATIONS  AND  TECHNICAL  REPORTS 

The  following  publications  were  generated  under  this  contract. 

4-C-a.  Books  (Work  Supported  by  USARO) 

1.  I.R.  Goodman,  R.P.S.  Mahler,  and  H.T.  Nguyen  (1997)  Mathematics  of  Data  Fusion, 
Kluwer  Academic  Publishers 

2. J. Goutsias, R. Mahler, and H.T. Nguyen, eds. (1997) Theory and Applications of
Random Sets, Springer-Verlag

4-C-b.  Conference  and  Other  Papers  Published  in  Books  (Work  Supported  by  USARO) 

1. R. Mahler (1997) "A Theoretical Unification of Knowledge-Based Systems With
Multisensor, Multitarget Estimation," in P. Wang, ed., Advances in Machine Intelligence
and Soft Computing, Vol. IV, Duke University Dept. of Elec. Eng., to appear.

2. R. Mahler (1996/1994) "A Unified Foundation for Data Fusion," in F.A. Sadjadi, ed.,
Selected Papers on Sensor and Data Fusion, SPIE Vol. MS-124, SPIE Optical
Engineering Press, Bellingham WA, pp. 325-345; reprinted from Proc. 7th Joint Service
Data Fusion Symposium, Vol. I Part 1 (Unclassified), Oct. 25-28 1994, Johns Hopkins
Applied Physics Laboratory, Laurel MD, Naval Air Development Center, Warminster
PA, pp. 153-173

3. R. Mahler (1995) "Finite-set statistics with application to data fusion," in A. Friedman,
ed., Mathematics in Industrial Problems, Part 7, Springer-Verlag, pp. 198-206

4. R. Mahler (1994) "Systematic data fusion using the theory of conditional random sets,"
in A. Friedman, ed., Mathematics in Industrial Problems, Part 6, pp. 156-165

4-C-c.  Conference  Papers  Published  in  Proceedings  (Work  Supported  by  USARO) 

1.  R.  Mahler  (1998)  "Global  posterior  densities  for  sensor  management,"  Proc.  1998  SPIE 
AeroSense  Conference,  April  13-17  1998,  Orlando,  to  appear 

2.  R.  Mahler  (1998)  "Information  for  fusion  management  and  performance  estimation," 
Proc.  1998  SPIE  AeroSense  Conference,  April  13-17  1998,  Orlando,  to  appear 

3. R. Mahler (1998) "Multisource, multitarget filtering: A unified approach," Proc. 1998
SPIE AeroSense Conference, April 13-17 1998, Orlando, to appear

4. R. Mahler (1997) "Decisions and Data Fusion," Proc. 1997 National Symposium on
Sensor and Data Fusion, April 14-17 1997, M.I.T. Lincoln Laboratories, to appear

5. R. Mahler (1997) "Measurement models for ambiguous evidence using conditional
random sets," Proc. 1997 SPIE AeroSense Conf., April 21-25 1997, Orlando, to appear

6. R. Mahler (1996) "Global Optimal Sensor Allocation," Proceedings of the Ninth National
Symposium on Sensor Fusion, Vol. I (Unclassified), Mar. 12-14 1996, Naval
Postgraduate School, Monterey CA, pp. 347-366

7. R. Mahler (1996) "Unified data fusion: fuzzy logic, evidence, and rules," in I. Kadar
and V. Libby, eds., Signal Processing, Sensor Fusion, and Target Recognition V, SPIE
Vol. 2755, SPIE Optical Engineering Press, Bellingham WA, pp. 226-237

8. R. Mahler (1995) "Information Theory and Data Fusion," Proceedings of the Eighth
National Symposium on Sensor Fusion, Vol. I (Unclassified), Dallas TX, March 15-17
1995, ERIM, Ann Arbor, pp. 279-292

9. R. Mahler (1995) "Nonadditive probability, finite-set statistics, and information fusion"
(Invited Paper), Proc. 34th IEEE Conf. on Decision and Control, New Orleans, Dec.
1995, pp. 1947-1952

10. R. Mahler (1995) "Unified nonparametric data fusion" (Invited Paper), in I. Kadar and
V. Libby, eds., Signal Processing, Sensor Fusion, and Target Recognition IV, SPIE Vol.
2484, SPIE Optical Engineering Press, Bellingham WA, pp. 66-74

11. R. Mahler (1994) "Global Integrated Data Fusion" (Invited Paper), Proceedings of the
Seventh National Symposium on Sensor Fusion, Vol. I (Unclassified), Sandia National
Laboratories, Albuquerque NM, March 16-18 1994, ERIM, Ann Arbor, pp. 187-199

12. R. Mahler (1994) "The random set approach to data fusion," in F.A. Sadjadi, ed.,
Automatic Object Recognition IV, SPIE Vol. 2234, SPIE Optical Engineering Press,
Bellingham WA, pp. 287-295

4-C-d.  Journal  Articles  Published  During  Performance  Period  (Not  Supported  by  ARO) 

1.  D.  Fixsen  and  R.  Mahler  (1997)  "The  Modified  Dempster-Shafer  Approach  to 
Classification,"  IEEE  Trans.  Systems,  Man  and  Cybernetics,  Part  A,  vol.  27  no.  1,  pp. 
96-104 

2. R. Mahler (1996) "Combining Ambiguous Evidence With Respect to Ambiguous a priori
Knowledge, I: Boolean Logic," IEEE Trans. Systems, Man and Cybernetics, Part A,
vol. 26, pp. 27-41

3. R. Mahler (1996) "Representing Rules as Random Sets: Statistical Correlations Between
Rules," Information Sciences, vol. 88, pp. 47-68

4. R. Mahler (1996) "Representing Rules as Random Sets, II: Iterated Rules," Int'l Jour.
Intell. Sys., vol. 11, pp. 583-610

5.  R.  Mahler  (1995)  "Combining  Ambiguous  Evidence  With  Respect  to  Ambiguous  a  priori 
Knowledge,  II:  Fuzzy  Logic,"  Fuzzy  Sets  and  Systems,  vol.  75,  pp.  319-354 

4-C-e.  Conference  Papers  Published  During  Performance  Period  (Not  Supported  by  ARO) 

1.  J.  Hatlestad,  R.  Mahler,  T.C.  Poling,  R.  Allen,  R.  Myre,  J.  Warner,  M.  Hatch,  E. 
Jahn,  and  J.  Kaina  (1997)  "ASW/ASUW  Parametric  ESM/ACINT  Multitarget  Classifier- 
Tracker," Proc. 1997 Combat Identification Systems Conf., to appear


4-D.  LIST  OF  ALL  PARTICIPATING  SCIENTIFIC  PERSONNEL 


• Ronald P.S. Mahler POSITION TITLE: Staff Scientist

Education

Institution & Location        Degree    Year    Field of Study

U. of Chicago, Chicago IL     B.A.      1970    Mathematics
U. of Minnesota, Mpls MN      B.E.E.    1980    Electrical Engineering
Brandeis U., Waltham MA       Ph.D.     1974    Mathematics

Dr. Mahler served as assistant professor of mathematics in the School of Mathematics
of the University of Minnesota, Minneapolis MN, from 1974-1979. He has been
employed at the Eagan MN facility of Lockheed Martin since 1980. He is currently on
the program committee, and is a session chair, of the data fusion conference of the SPIE
AeroSense conference. He was also principal organizer, co-chair, and co-editor for a
Workshop on Applications and Theory of Random Sets, held at the Institute for
Mathematics and Its Applications (Minneapolis), jointly sponsored by ONR, ARO, and
Lockheed Martin. The proceedings of this workshop are to appear in hardcover
(Springer-Verlag) in 1997.


•  Paul  Leavitt  POSITION  TITLE:  Software  Analyst 

Education 

Institution  &  Location  Degree  Year  Field  of  Study 

BSEE  1985  Electrical  Engineering 

BSPh  1979  Pharmacy 

Mr. Leavitt has an extensive background in Real Time Computer software and hardware
design. He has extensive experience in real-world applications, including: Air
Superiority Fighter computational requirements analysis for the ATE Program; 1750A
computer design and microcode design for the ATE and B2 Programs. He has integrated
GPS Interferometry with Inertial Navigation components to provide a low cost
Agricultural Navigation System (AGNAV). He has an excellent background in data
analysis and algorithm design and implementation in such areas as GPS Carrier Phase
ambiguity resolution, Attitude determination, and Kalman filtering. He has proficiency
in communications, such as Spread Spectrum Data Links, T1, E1, and proprietary
algorithms used in communications programs such as ABCCC. His current tasks are
related to implementation of proprietary algorithms on the Mercury parallel computer
architecture.



6.  BIBLIOGRAPHY 


1.  R.T.  Antony  (1995)  Principles  of  Data  Fusion  Automation,  Artech  House,  Boston 

2.  R.A.  Brualdi  and  H.J.  Ryser  (1991)  Combinatorial  Matrix  Theory,  Cambridge  University 
Press,  Chapter  7 

3.  J.A.  Bucklew  (1990)  Large  Deviation  Techniques  in  Decision,  Simulation,  and  Estimation, 
John  Wiley  and  Sons 

4.  C.-Y.  Chong,  S.  Mori,  and  K.C.  Chang  (1990)  "Distributed  Multitarget,  Multisensor 
Tracking,"  in  Y.  Bar-Shalom,  ed.,  Multitarget-Multisensor  Tracking,  Artech  House,  pp.  247-295 

5.  D.J.  Daley  and  D.  Vere-Jones,  An  Introduction  to  the  Theory  of  Point  Processes, 
Springer-Verlag  1988 


6.  H.E.  Daniels  (1954)  "Saddlepoint  Approximations  in  Statistics,"  Annals  of  Math.  Stat.,  vol.  25, 
pp.  631-650 

7.  R.S.  Ellis  (1985)  Entropy,  Large  Deviations,  and  Statistical  Mechanics,  Springer-Verlag 

8.  D.  Fixsen  and  R.  Mahler  (1997)  "The  Modified  Dempster-Shafer  Approach  to 
Classification,"  IEEE  Trans.  Systems,  Man  and  Cybernetics,  Part  A,  vol.  27  no.  1,  pp.  96-104 

9.  I.R.  Goodman  (1994)  "A  New  Characterization  of  Fuzzy  Logic  Operators  Producing 
Homomorphism-Like  Relations  With  One-Point  Coverages  of  Random  Sets,"  in  P.P.  Wang,  ed., 
Advances  in  Fuzzy  Theory  and  Technology,  Vol.  II,  pp.  131-157 

10.  I.R.  Goodman  (1983)  "A  unified  approach  to  modeling  and  combining  of  evidence  through 
random  set  theory,"  Proc.  6th  MIT/ONR  Workshop  on  C3  Systems,  Cambridge  MA,  pp.  42-47 

11.  I.R.  Goodman  (1982)  "Fuzzy  sets  as  equivalence  classes  of  random  sets,"  in  R.  Yager,  ed., 
Fuzzy  Sets  and  Possibility  Theory,  Pergamon,  pp.  327-343 

12.  I.R.  Goodman,  R.P.S.  Mahler,  and  H.T.  Nguyen  (1997)  Mathematics  of  Data  Fusion, 
Kluwer  Academic  Publishers,  1997 


13.  I.R.  Goodman  and  H.T.  Nguyen  (1985)  Uncertainty  Models  for  Knowledge  Based  Systems, 
North-Holland,  Amsterdam 

14.  J.  Goutsias,  R.  Mahler,  and  H.T.  Nguyen,  eds.  (1997)  Theory  and  Applications  of  Random 
Sets,  Springer-Verlag,  1997 




15.  M.  Grabisch,  H.T.  Nguyen,  and  E.A.  Walker  (1995)  Fundamentals  of  Uncertainty  Calculi 
With  Applications  to  Fuzzy  Inference,  Kluwer  Academic  Publishers 

16.  L.  Greengard  (1994)  "Fast  Algorithms  for  Classical  Physics,"  Science,  vol.  265, 
pp.  909-914 

17.  D.L.  Hall,  Mathematical  Techniques  in  Multisensor  Data  Fusion,  Artech  House,  1992 

18.  K.  Hestir,  H.T.  Nguyen,  and  G.S.  Rogers  (1991)  "A  random  set  formalism  for  evidential 
reasoning,"  in  I.R.  Goodman,  M.M.  Gupta,  H.T.  Nguyen  and  G.S.  Rogers,  editors,  Conditional 
Logic  in  Expert  Systems,  North-Holland,  Amsterdam,  pp.  309-344 

19.  T.L.  Hill  (1987)  Statistical  Mechanics:  Principles  and  Practical  Applications,  Dover 
Publications 

20.  E.J.  Hinch  (1991)  Perturbation  Methods,  Cambridge  University  Press 

21.  S.J.  Julier  and  J.K.  Uhlmann  (1997)  "A  Practical  Real-Time  Algorithm  for  General 
Nonlinear  Filtering,"  Proc.  1997 SPIE  AeroSense  Conference,  April  20-25  1997,  Orlando  FL, 
Vol.  3068,  to  appear 

22.  S.J.  Julier,  J.K.  Uhlmann,  and  H.F.  Durrant-Whyte  (1995)  "A  New  Approach  for  Filtering 
Nonlinear  Systems,"  Proc.  Amer.  Contr.  Conf.,  Seattle  WA,  pp.  1628-1632 

23.  N.  Karmarkar,  R.  Karp,  R.  Lipton,  L.  Lovász,  and  M.  Luby  (1993)  "A  Monte-Carlo 
Algorithm  for  Estimating  the  Permanent,"  SIAM  J.  Comput.,  vol.  22  no.  2,  pp.  284-293 

24.  K.  Kastella  (1995)  "Event-Averaged  Maximum  Likelihood  Estimation  and  Mean-Field 
Theory  in  Multitarget  Tracking,"  IEEE  Trans.  Aut.  Contr.,  vol.  40  no.  6,  pp.  1070-1074 

25.  D.G.  Kendall  (1963)  "Foundations  of  a  theory  of  random  sets,"  in  E.F.  Harding  and  D.G. 
Kendall,  editors,  Stochastic  Geometry,  J.  Wiley,  New  York,  pp.  322-376 

26.  R.  Kruse,  E.  Schwencke,  and  J.  Heinsohn  (1991)  Uncertainty  and  Vagueness  in  Knowledge- 
Based  Systems,  Springer-Verlag 

27.  N.N.  Lyshenko,  "Statistics  of  Random  Compact  Sets  in  Euclidean  Space,"  Journal  of  Soviet 
Mathematics,  Vol.  21  1983,  Plenum  Publishing,  pp.  76-92 

28.  R.  Mahler  (1997)  "Decisions  and  Data  Fusion,"  Proc.  1997 National  Symposium  on  Sensor 
and  Data  Fusion,  April  14-17  1997,  M.I.T.  Lincoln  Laboratories,  to  appear 

29.  R.  Mahler  (1997)  "Measurement  models  for  ambiguous  evidence  using  conditional  random 
sets,"  Proc.  1997  SPIE  AeroSense  Conf.,  April  21-25  1997,  Orlando,  to  appear 




30.  R.  Mahler  (1997)  "A  Theoretical  Unification  of  Knowledge-Based  Systems  With 
Multisensor,  Multitarget  Estimation,"  in  P.  Wang,  ed.,  Advances  in  Machine  Intelligence  and 
Soft  Computing,  Vol.  IV,  Duke  University  Dept.  of  Elec.  Eng.,  to  appear. 

31.  R.  Mahler  (1996)  "Combining  Ambiguous  Evidence  With  Respect  to  Ambiguous  a  priori 
Knowledge,  I:  Boolean  Logic,"  IEEE  Trans.  Systems,  Man  and  Cybernetics,  Part  A,  vol.  26, 
pp.  27-41 

32.  R.  Mahler  (1996)  "Global  Optimal  Sensor  Allocation,"  Proceedings  of  the  Ninth  National 
Symposium  on  Sensor  Fusion,  Vol.  I  (Unclassified),  Mar.  12-14  1996,  Naval  Postgraduate 
School,  Monterey  CA,  pp.  347-366 

33.  R.  Mahler  (1996)  "Representing  Rules  as  Random  Sets:  Statistical  Correlations  Between 
Rules,"  Information  Sciences,  vol.  88,  pp.  47-68 

34.  R.  Mahler  (1996)  "Representing  Rules  as  Random  Sets,  II:  Iterated  Rules,"  Intn’l  Jour. 
Intell.  Sys.,  vol.  11,  pp.  583-610 

35.  R.  Mahler  (1996)  "Unified  data  fusion:  fuzzy  logic,  evidence,  and  rules,"  in  I.  Kadar  and 
V.  Libby,  eds.,  Signal  Processing,  Sensor  Fusion,  and  Target  Recognition  V,  SPIE  Vol.  2755, 
SPIE  Optical  Engineering  Press,  Bellingham  WA,  pp.  226-237 

36.  R.  Mahler  (1996/1994)  "A  Unified  Foundation  for  Data  Fusion,"  in  F.A.  Sadjadi,  ed., 
Selected  Papers  on  Sensor  and  Data  Fusion,  SPIE  Vol.  MS-124,  SPIE  Optical  Engineering 
Press,  Bellingham  WA,  pp.  325-345;  reprinted  from  Proc.  7th  Joint  Service  Data  Fusion 
Symposium,  Vol.  I  Part  1  (Unclassified),  Oct.  25-28  1994,  Johns  Hopkins  Applied  Physics 
Laboratory,  Laurel  MD,  Naval  Air  Development  Center,  Warminster  PA,  pp.  153-173 

37.  R.  Mahler  (1995)  "Combining  Ambiguous  Evidence  With  Respect  to  Ambiguous  a  priori 
Knowledge,  II:  Fuzzy  Logic,"  Fuzzy  Sets  and  Systems,  vol.  75,  pp.  319-354 

38.  R.  Mahler  (1995)  "Finite-set  statistics  with  application  to  data  fusion,"  in  A.  Friedman,  ed., 
Mathematics  in  Industrial  Problems,  Part  7,  Springer-Verlag,  pp.  198-206 

39.  R.  Mahler  (1995)  "Information  Theory  and  Data  Fusion,"  Proceedings  of  the  Eighth 
National  Symposium  on  Sensor  Fusion,  Vol.  I  (Unclassified),  Dallas  TX,  March  15-17  1995, 
ERIM,  Ann  Arbor,  pp.  279-292 

40.  R.  Mahler  (1995)  "Nonadditive  probability,  finite-set  statistics,  and  information  fusion" 
(Invited  Paper),  Proc.  34th  IEEE  Conf.  on  Decision  and  Control,  New  Orleans,  Dec.  1995, 
pp.  1947-1952 

41.  R.  Mahler  (1995)  "Unified  nonparametric  data  fusion"  (Invited  Paper),  in  I.  Kadar  and  V. 
Libby,  eds.,  Signal  Processing,  Sensor  Fusion,  and  Target  Recognition  IV,  SPIE  Vol.  2484, 
SPIE  Optical  Engineering  Press,  Bellingham  WA,  pp.  66-74 

42.  R.  Mahler  (1994)  "Global  Integrated  Data  Fusion"  (Invited  Paper),  Proceedings  of  the 
Seventh  National  Symposium  on  Sensor  Fusion,  Vol.  I  (Unclassified),  Sandia  National 
Laboratories,  Albuquerque  NM,  March  16-18  1994,  ERIM,  Ann  Arbor,  pp.  187-199 

43.  R.  Mahler  (1994)  "The  random  set  approach  to  data  fusion,"  in  F.A.  Sadjadi,  ed., 
Automatic  Object  Recognition  IV,  SPIE  Vol.  2234,  SPIE  Optical  Engineering  Press,  Bellingham 
WA,  1994,  pp.  287-295 

44.  R.  Mahler  (1994)  "Systematic  data  fusion  using  the  theory  of  conditional  random  sets,"  in 
A.  Friedman,  ed.,  Mathematics  in  Industrial  Problems,  Part  6,  pp.  156-165 

45.  G.  Matheron  (1975)  Random  Sets  and  Integral  Geometry,  J.  Wiley,  New  York 

46.  I.S.  Molchanov  (1993)  Limit  Theorems  for  Unions  of  Random  Closed  Sets,  Springer-Verlag 
Lecture  Notes  in  Mathematics  No.  1561 

47.  S.  Mori,  C.-Y.  Chong,  E.  Tse,  and  R.P.  Wishner  (1984)  "Multitarget  Multisensor  Tracking 
Problems,  Part  I:  A  General  Solution  and  a  Unified  View  on  Bayesian  Approaches,"  Revised 
Version,  Technical  Report  TR-1048-01,  Advanced  Information  and  Decision  Systems,  Inc., 
Mountain  View  CA,  August  1984.  My  thanks  to  Dr.  Mori  for  making  this  report  available  to 
me  (Dr.  Shozo  Mori,  personal  communication,  Feb.  28  1995) 

48.  S.  Mori,  C.-Y.  Chong,  E.  Tse,  and  R.P.  Wishner  (1986)  "Tracking  and  Classifying 
Multiple  Targets  Without  A  Priori  Identification,"  IEEE  Transactions  on  Automatic  Control, 
Vol.  AC-31  No.  5,  pp.  401-409 

49.  H.T.  Nguyen  (1978)  "On  random  sets  and  belief  functions,"  Journal  of  Mathematical 
Analysis  and  Applications,  Vol.  65,  pp.  531-542 

50.  S.D.  O’Neil  and  M.F.  Bridgland  (1991)  "Fast  Algorithms  for  Joint  Probabilistic  Data 
Association,"  Proceedings  of  the  Fourth  National  Symposium  on  Sensor  Fusion,  Vol.  I 
(Unclassified),  Apr.  2-4  1991,  Orlando,  ERIM,  Ann  Arbor  MI,  pp.  173-189 

51.  A.I.  Orlov  (1978)  "Fuzzy  and  random  sets"  (in  Russian),  Prikladnoi  Mnogomerni 
Statisticheskii  Analys,  Moscow 

52.  R.K.  Pathria  (1972)  Statistical  Mechanics,  Oxford  University  Press 

53.  P.  Quinio  and  T.  Matsuyama  (1991)  "Random  Closed  Sets:  A  Unified  Approach  to  the 
Representation  of  Imprecision  and  Uncertainty,"  in  R.  Kruse  and  P.  Siegel,  eds.,  Symbolic  and 
Quantitative  Approaches  to  Uncertainty,  Springer-Verlag,  pp.  282-286 




54.  L.E.  Rasmussen  (1994)  "Approximating  the  Permanent:  A  Simple  Approach,"  Random 
Structures  and  Algorithms,  vol.  5  no.  2,  pp.  349-361 

55.  T.L.  Saaty  and  J.  Bram  (1964)  Nonlinear  Mathematics,  Dover  Publications 

56.  G.E.  Shilov  and  B.L.  Gurevich,  Integral,  Measure,  and  Derivative:  A  Unified  Approach, 
Prentice-Hall,  1966 

57.  P.  Smets  (1992)  "The  Transferable  Belief  Model  and  Random  Sets,"  International  Journal 
of  Intelligent  Systems,  Vol.  7,  pp.  37-46 

58.  L.  Tierney  and  J.B.  Kadane  (1986)  "Accurate  Approximations  for  Posterior  Moments  and 
Marginal  Densities,"  J.  Amer.  Stat.  Assoc.,  vol.  81  no.  392,  pp.  82-86 

59.  R.B.  Washburn  (1987)  "A  random  point  process  approach  to  multiobject  tracking,"  Proc. 
American  Control  Conf.,  vol.  3,  pp.  1846-1852 

60.  E.  Waltz  and  J.  Llinas,  Multisensor  Data  Fusion,  Artech  House,  Boston  1990 

61.  R.L.  Wheeden  and  A.  Zygmund  (1977)  Measure  and  Integral:  An  Introduction  to  Real 
Analysis,  Marcel  Dekker,  Inc. 

62.  H.W.  Sorenson  and  D.L.  Alspach  (1971)  "Recursive  Bayesian  Estimation  Using  Gaussian 
Sums,"  Automatica,  vol.  7,  pp.  465-479 

63.  D.L.  Alspach  (1975)  "A  Gaussian  Sum  Approach  to  the  Multi-Target  Identification- 
Tracking  Problem,"  Automatica,  vol.  11,  pp.  285-296 

64.  J.  T.-H.  Lo  (1994)  "Synthetic  Approach  to  Optimal  Filtering,"  IEEE  Trans.  Neural 
Networks,  vol.  5  no.  5,  pp.  803-811 

65.  N.  Portenko,  H.  Salehi,  and  A.  Skorokhod  (1997)  "On  optimal  filtering  of  multitarget 
tracking  systems  based  on  point  processes  observations,"  Random  Operators  and  Stochastic 
Equations,  vol.  1,  pp.  1-34 




